Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: 3.0.0-alpha1
    • Component/s: scripts
    • Labels:
    • Target Version/s:
    • Hadoop Flags:
      Incompatible change
    • Release Note:
      <!-- markdown -->
      The Hadoop shell scripts have been rewritten to fix many long-standing bugs and include some new features. While an eye has been kept towards compatibility, some changes may break existing installations.

      INCOMPATIBLE CHANGES:

      * The pid and out files for secure daemons have been renamed to include the appropriate ${HADOOP\_IDENT\_STR}. This should allow, with proper configurations in place, for multiple versions of the same secure daemon to run on a host. Additionally, pid files are now created when daemons are run in interactive mode. This will also prevent the accidental starting of two daemons with the same configuration prior to launching java (i.e., "fast fail" without having to wait for socket opening).
      * All Hadoop shell script subsystems now execute hadoop-env.sh, which allows for all of the environment variables to be in one location. This was not the case previously.
      * The default content of *-env.sh has been significantly altered, with the majority of defaults moved into more protected areas inside the code. Additionally, these files no longer auto-append; a variable set on the command line prior to calling a shell command must contain the entire content, not just any extra settings. This brings Hadoop more in line with the vast majority of other software packages. (See the sketch after this list.)
      * All HDFS\_\*, YARN\_\*, and MAPRED\_\* environment variables act as overrides to their equivalent HADOOP\_\* environment variables when 'hdfs', 'yarn', 'mapred', and related commands are executed. Previously, these were separated out which meant a significant amount of duplication of common settings.
      * hdfs-config.sh and hdfs-config.cmd were inadvertently duplicated into libexec and sbin. The sbin versions have been removed.
      * The log4j settings forcibly set by some *-daemon.sh commands have been removed. These settings are now configurable in the \*-env.sh files via \*\_OPT.
      * Support for various undocumented YARN log4j.properties files has been removed.
      * Support for ${HADOOP\_MASTER} and the related rsync code have been removed.
      * The undocumented and unused yarn.id.str Java property has been removed.
      * The unused yarn.policy.file Java property has been removed.
      * We now require bash v3 (released July 27, 2004) or better in order to take advantage of better regex handling and ${BASH\_SOURCE}. POSIX sh will not work.
      * Support for --script has been removed. We now use ${HADOOP\_\*\_PATH} or ${HADOOP\_PREFIX} to find the necessary binaries. (See other note regarding ${HADOOP\_PREFIX} auto discovery.)
      * Non-existent classpaths, ld.so library paths, JNI library paths, etc, will be ignored and stripped from their respective environment settings.
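
      For illustration, here is a minimal sketch of the no-auto-append and override behavior described above (the paths and values are hypothetical):

          # hadoop-env.sh: values must be complete, since nothing is auto-appended anymore
          export HADOOP_LOG_DIR=/var/log/hadoop

          # yarn-env.sh: a YARN_* variable overrides its HADOOP_* equivalent
          # whenever the yarn command (and related commands) runs
          export YARN_LOG_DIR=/var/log/yarn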

      NEW FEATURES:

      * Daemonization has been moved from *-daemon.sh to the bin commands via the --daemon option. Simply use --daemon start to start a daemon, --daemon stop to stop a daemon, and --daemon status to set $? to the daemon's status. The return code for status is LSB-compatible. For example, 'hdfs --daemon start namenode'.
      * It is now possible to override some of the shell code capabilities to provide site-specific functionality without replacing the shipped versions. Replacement functions should go into the new hadoop-user-functions.sh file. (See the sketch after this list.)
      * A new option called --buildpaths will attempt to add developer build directories to the classpath to allow for in source tree testing.
      * Operations which trigger ssh connections can now use pdsh if installed. ${HADOOP\_SSH\_OPTS} still gets applied.
      * Added distch and jnipath subcommands to the hadoop command.
      * Shell scripts now support a --debug option which will report basic information on the construction of various environment variables, java options, classpath, etc. to help in configuration debugging.
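
      As a sketch of the override hook (the function name and body here are illustrative assumptions, not shipped code):

          # hadoop-user-functions.sh: redefining a function here replaces the
          # shipped version without editing anything under libexec
          function hadoop_error
          {
            # at this site, also copy errors to syslog
            logger -t hadoop "$*"
            echo "$*" 1>&2
          }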

      BUG FIXES:

      * ${HADOOP\_CONF\_DIR} is now properly honored everywhere, without requiring symlinking and other such tricks.
      * ${HADOOP\_CONF\_DIR}/hadoop-layout.sh is now documented with a provided hadoop-layout.sh.example file.
      * Shell commands should now work properly when called as a relative path, without ${HADOOP\_PREFIX} being defined, and as the target of bash -x for debugging. If ${HADOOP\_PREFIX} is not set, it will be automatically determined based upon the current location of the shell library. Note that other parts of the extended Hadoop ecosystem may still require this environment variable to be configured.
      * Operations which trigger ssh will now limit the number of connections run in parallel to ${HADOOP\_SSH\_PARALLEL} to prevent memory and network exhaustion. By default, this is set to 10. (See the example after this list.)
      * ${HADOOP\_CLIENT\_OPTS} support has been added to a few more commands.
      * Some subcommands were not listed in the usage.
      * Various options on hadoop command lines were supported inconsistently. These have been unified into hadoop-config.sh. --config is still required to be first, however.
      * ulimit logging for secure daemons no longer assumes /bin/bash but does assume bash is on the command line path.
      * Removed references to some Yahoo! specific paths.
      * Removed unused slaves.sh from YARN build tree.
      * Many exit states have been changed to reflect reality.
      * Shell level errors now go to STDERR. Before, many of them went incorrectly to STDOUT.
      * CDPATH with a period (.) should no longer break the scripts.
      * The scripts no longer try to chown directories.
      * If ${JAVA\_HOME} is not set on OS X, it now properly detects it instead of throwing an error.
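
      For example, the ssh fan-out could be raised in hadoop-env.sh (the value is hypothetical; 10 remains the default):

          export HADOOP_SSH_PARALLEL=25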

      IMPROVEMENTS:

      * The *.out files are now appended instead of overwritten to allow for external log rotation.
      * The style and layout of the scripts is much more consistent across subprojects.
      * More of the shell code is now commented.
      * Significant amounts of redundant code have been moved into a new file called hadoop-functions.sh.
      * The various *-env.sh have been massively changed to include documentation and examples on what can be set, ramifications of setting, etc. for all variables that are expected to be set by a user.
      * There is now some trivial de-duplication and sanitization of the classpath and JVM options. This allows, amongst other things, for custom settings in \*\_OPTS for Hadoop daemons to override defaults and other generic settings (i.e., ${HADOOP\_OPTS}). This is particularly relevant for Xmx settings, as one can now set them in \_OPTS and ignore the heap-specific options for daemons which force the size in megabytes. (See the example after this list.)
      * Subcommands have been alphabetized in both usage and in the code.
      * All/most of the functionality provided by the sbin/* commands has been moved to either their bin/ equivalents or made into functions. The rewritten versions of these commands are now wrappers to maintain backward compatibility.
      * Usage information is given with the following options/subcommands for all scripts using the common framework: --? -? ? --help -help -h help
      * Several generic environment variables have been added to provide a common configuration for pids, logs, and their security equivalents. The older versions still act as overrides to these generic versions.
      * Groundwork has been laid to allow for custom secure daemon setup using something other than jsvc (e.g., pfexec on Solaris).
      * Scripts now test and report better error messages for various states of the log and pid dirs on daemon startup. Before, unprotected shell errors would be displayed to the user.
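
      For example, with de-duplication in place, a daemon-specific \*\_OPTS heap setting wins over the generic one (a sketch with hypothetical values):

          # hadoop-env.sh
          export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true"
          # daemon-specific override: this -Xmx takes precedence for the NameNode
          export HADOOP_NAMENODE_OPTS="-Xmx24g"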

      Description

      Umbrella JIRA for shell script rewrite. See more-info.txt for more details.

      1. HADOOP-9902.patch
        142 kB
        Allen Wittenauer
      2. HADOOP-9902.txt
        86 kB
        Allen Wittenauer
      3. hadoop-9902-1.patch
        101 kB
        Allen Wittenauer
      4. HADOOP-9902-10.patch
        190 kB
        Allen Wittenauer
      5. HADOOP-9902-11.patch
        191 kB
        Allen Wittenauer
      6. HADOOP-9902-12.patch
        191 kB
        Allen Wittenauer
      7. HADOOP-9902-13.patch
        191 kB
        Allen Wittenauer
      8. HADOOP-9902-13-branch-2.patch
        190 kB
        Allen Wittenauer
      9. HADOOP-9902-14.patch
        193 kB
        Allen Wittenauer
      10. HADOOP-9902-15.patch
        193 kB
        Allen Wittenauer
      11. HADOOP-9902-16.patch
        193 kB
        Allen Wittenauer
      12. HADOOP-9902-2.patch
        151 kB
        Allen Wittenauer
      13. HADOOP-9902-3.patch
        165 kB
        Allen Wittenauer
      14. HADOOP-9902-4.patch
        185 kB
        Allen Wittenauer
      15. HADOOP-9902-5.patch
        186 kB
        Allen Wittenauer
      16. HADOOP-9902-6.patch
        186 kB
        Allen Wittenauer
      17. HADOOP-9902-7.patch
        187 kB
        Allen Wittenauer
      18. HADOOP-9902-8.patch
        187 kB
        Allen Wittenauer
      19. HADOOP-9902-9.patch
        190 kB
        Allen Wittenauer
      20. more-info.txt
        3 kB
        Allen Wittenauer

        Issue Links

          Activity

          drankye Kai Zheng added a comment -

          It's great to see this. HADOOP-9873 reported an issue that would be better cleaned up here as well, since this is for the long term.

          aw Allen Wittenauer added a comment -

          Just to give an idea of what I'm thinking, here is a sample. Note this is a) not even close to final, b) likely has bugs, c) is very incomplete, and d) hasn't been fully optimized at all.

          This is for 2.1.0. Sorry for not being in patch format, but I'm not at that stage yet.

          aw Allen Wittenauer added a comment -

          Adding a bunch of links to JIRAs for xref to when various things got added.

          A quick read leaves me with one impression: YARN is incredibly inconsistent, and its attempts to "make things easier" have actually made things harder for both the user and the developer. Worse, a lot of the stuff is completely undocumented outside of JIRAs. I don't know if this situation is salvageable without undoing some of this nonsense.

          aw Allen Wittenauer added a comment -

          Question for the crowd. In bin/yarn is... this:

          # for developers, add Hadoop classes to CLASSPATH
          if [ -d "$HADOOP_YARN_HOME/yarn-api/target/classes" ]; then
            CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-api/target/classes
          fi
          if [ -d "$HADOOP_YARN_HOME/yarn-common/target/classes" ]; then
            CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-common/target/classes
          fi
          if [ -d "$HADOOP_YARN_HOME/yarn-mapreduce/target/classes" ]; then
            CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-mapreduce/target/classes
          fi
          if [ -d "$HADOOP_YARN_HOME/yarn-master-worker/target/classes" ]; then
            CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-master-worker/target/classes
          fi
          if [ -d "$HADOOP_YARN_HOME/yarn-server/yarn-server-nodemanager/target/classes" ]; then
            CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-server/yarn-server-nodemanager/target/classes
          fi
          if [ -d "$HADOOP_YARN_HOME/yarn-server/yarn-server-common/target/classes" ]; then
            CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-server/yarn-server-common/target/classes
          fi
          if [ -d "$HADOOP_YARN_HOME/yarn-server/yarn-server-resourcemanager/target/classes" ]; then
            CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/yarn-server/yarn-server-resourcemanager/target/classes
          fi
          if [ -d "$HADOOP_YARN_HOME/build/test/classes" ]; then
            CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/target/test/classes
          fi
          if [ -d "$HADOOP_YARN_HOME/build/tools" ]; then
            CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/build/tools
          fi
          

          [I'm pretty sure at this point in the execution path, the YARN jars from the non-build directories have already been inserted into the classpath via the early call into hadoop-config.sh... which means this code likely isn't working as intended. For now, let's assume that it is.]

          After cleanup, it looks a bit more like this, using the before option to push the entries to the front of the classpath and reversing to maintain the pathing order. [although I suspect that a) we can trim this down even further with an ls -d and b) ordering doesn't matter]:

          # each "before" prepends, so listing in reverse preserves the
          # original search order once everything is pushed to the front
          add_classpath "$HADOOP_YARN_HOME/build/tools" before
          add_classpath "$HADOOP_YARN_HOME/build/test/classes" before
          for debugpath in yarn-server-resourcemanager yarn-server-common yarn-server-nodemanager \
                           yarn-master-worker yarn-mapreduce yarn-common yarn-api; do
            add_classpath "$HADOOP_YARN_HOME/$debugpath/target/classes" before
          done
          

          Since this is buried in bin/yarn, this is only getting set if the yarn command is being used. This might lead to some interesting situations where we're running test yarn code on stable HDFS. This may or may not be desirable. So now the question:

          Should test classpaths always be inserted if we detect them?

          Your choices:

          a) We actually cover this as part of the unit tests. Strip all this stuff out so our commands run faster!
          b) Keep the debug code per-section. i.e., hdfs command will only get hdfs and common test code, yarn command will get the yarn and common test code, hadoop command only gets common.
          c) Everyone gets everything. i.e., using the hdfs command will add in the yarn test code.

          Reminder: hadoop-config.sh adds in all of the classpaths we know about. I don't think this is fixable without breaking compatibility in a major way. (Changing the 'hadoop classpath' command to show all paths is certainly do-able but who knows what else would break...)

          Thoughts?

          aw Allen Wittenauer added a comment -

          or a 4th option:

          d) set HADOOP_BUILD_DEBUG="sub sub ..." which would only enable the classpath for the subprojects listed. (i.e., HADOOP_BUILD_DEBUG="hdfs yarn" would enable both hdfs and yarn but not common.)
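
          A rough sketch of how (d) might be parsed, reusing the add_classpath helper from the earlier comment (the paths are illustrative only):

          for sub in ${HADOOP_BUILD_DEBUG}; do
            case "${sub}" in
              common) add_classpath "$HADOOP_PREFIX/hadoop-common-project/target/classes" before ;;
              hdfs)   add_classpath "$HADOOP_PREFIX/hadoop-hdfs-project/target/classes" before ;;
              yarn)   add_classpath "$HADOOP_YARN_HOME/target/classes" before ;;
            esac
          done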

          chris.douglas Chris Douglas added a comment -

          Either (a) or (c) are least-surprising. IMHO: a flag for (c) would be nice, but (a) is sufficient.

          aw Allen Wittenauer added a comment -

          Digging into this further, it looks like YARN has a different build structure than HDFS, common, and mapreduce, which is why these extra classpaths aren't added. I'll see if I can work out what should be added and wrap them around a new flag (--buildpaths).

          Thanks!

          aw Allen Wittenauer added a comment -

          Uploaded another mostly untested code drop with contents of bin/ and libexec/ to show progress, get some feedback, etc. Basic stuff does appear to work for me, but I haven't tried starting any daemons yet since I'm still reworking the secure DN starter code to be much more flexible. Plus I'm still working my way through sbin. A few things worth pointing out:

          Load order should be consistent now. Basic path is:

          • bin/command sets HADOOP_NEW_CONFIG to disable auto-population. It then loads:
            • xyz-config.sh
              • hadoop-config.sh
                • hadoop-env.sh
                • hadoop-functions.sh
              • xyz-env.sh <- loading this here should allow for users to override quite a bit more, at least that's the hypothesis
          • (do whatever)
          • finalize <- fills in any missing -D's
          • exec java
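
          In shell terms, a bin command would look roughly like this (a sketch; hadoop_finalize, HADOOP_LIBEXEC_DIR, and JAVA are assumed names, not a final API):

          HADOOP_NEW_CONFIG=true                    # disable auto-population
          . "${HADOOP_LIBEXEC_DIR}/hdfs-config.sh"  # pulls in hadoop-config.sh, which
                                                    # sources hadoop-env.sh and
                                                    # hadoop-functions.sh, then hdfs-env.sh
          # ... subcommand handling sets CLASS, CLASSPATH, HADOOP_OPTS ...
          hadoop_finalize                           # fill in any missing -D's
          exec "${JAVA}" ${HADOOP_OPTS} "${CLASS}" "$@"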

          This mainly has implications for YARN which did/does really oddball things with YARN_OPTS. There is bound to be some (edge-case?) breakage here, but (IMO) consistency is more important. I tried to 'make it work', but...

          Misc.

          • users can override functions in hadoop-env.sh. This means if they need extra/replacement functionality, it's totally doable without replacing anything in libexec. I might make a specific call-out.
          • double-dash options (i.e., --config) are handled by the same code, consistently, in hadoop-config.sh. Also, since this is a loop, the order of the options no longer matters, except for --config (for what are hopefully obvious reasons). --help and friends work by having the top level define a function called usage(). (See the sketch after this list.)
          • Most/all of the crazy if/fi constructions (esp those buried inside a case!) have been replaced with a single-parent case statement. Also, an effort has been made to mostly alphabetize the commands in the case statement, although I'm sure I missed one or two.
          • Option C from above has been implemented. I think.
          • I haven't touched httpfs yet at all.
          • You can see some previews of some of the stuff in sbin. For example, slaves.sh now uses pdsh if it is installed.
          • LD_LIBRARY_PATH, CLASSPATH, JAVA_LIBRARY_PATH are now de-duped.
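
          As a sketch of the usage() convention mentioned above (the message contents are illustrative):

          # in bin/hdfs: the common option handling calls this for --help and friends
          function usage
          {
            echo "Usage: hdfs [--config confdir] [--daemon (start|stop|status)] subcommand"
          }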
          aw Allen Wittenauer added a comment -

          Oh, one other thing:

          • removed rm-config/log4j.properties and nm-config/log4j.properties support. These appear to be completely undocumented.
          aw Allen Wittenauer added a comment -

          Would anyone miss any of the following YARN properties being defined:

          • yarn.id.str
          • yarn.home.dir
          • yarn.policy.file

          None of these are used in the Hadoop source, and none appear to be documented.

          aw Allen Wittenauer added a comment -

          Since I'm getting ready to post a patch, how about an 'end result' example! Here is the command line for the resource manager from my real, 100+ node test grid.

          Before the changes:

          /usr/java/default/bin/java
          -Dproc_resourcemanager
          -Xmx1000m
          -Xmx24g
          -Dyarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log
          -Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY
          -Xloggc:/export/apps/hadoop/logs/gc-nn.log-201308261726
          -Dcom.sun.management.jmxremote.port=9010
          -verbose:gc
          -XX:+PrintGCDetails
          -XX:+PrintGCTimeStamps
          -XX:+PrintGCDateStamps
          -Dcom.sun.management.jmxremote=true
          -Dcom.sun.management.jmxremote.authenticate=false
          -Dcom.sun.management.jmxremote.ssl=false
          -Dhadoop.log.dir=/export/apps/hadoop/logs
          -Dyarn.log.dir=/export/apps/hadoop/logs
          -Dhadoop.log.file=yarn-yarn-resourcemanager-eat1-hcl4083.grid.linkedin.com.log
          -Dyarn.log.file=yarn-yarn-resourcemanager-eat1-hcl4083.grid.linkedin.com.log
          -Dyarn.home.dir=
          -Dyarn.id.str=yarn
          -Dhadoop.root.logger=INFO,DRFA
          -Dyarn.root.logger=INFO,DRFA
          -Djava.library.path=/export/apps/hadoop/latest/lib/native
          -Dyarn.policy.file=hadoop-policy.xml
          -Dhadoop.log.dir=/export/apps/hadoop/logs
          -Dyarn.log.dir=/export/apps/hadoop/logs
          -Dhadoop.log.file=yarn-yarn-resourcemanager-eat1-hcl4083.grid.linkedin.com.log
          -Dyarn.log.file=yarn-yarn-resourcemanager-eat1-hcl4083.grid.linkedin.com.log
          -Dyarn.home.dir=/export/apps/hadoop/latest
          -Dhadoop.home.dir=/export/apps/hadoop/latest
          -Dhadoop.root.logger=INFO,DRFA
          -Dyarn.root.logger=INFO,DRFA
          -Djava.library.path=/export/apps/hadoop/latest/lib/native
          -classpath
          /export/apps/hadoop/site/etc/hadoop
          /export/apps/hadoop/site/etc/hadoop
          /export/apps/hadoop/site/etc/hadoop
          /export/apps/hadoop/latest/share/hadoop/common/lib/*
          /export/apps/hadoop/latest/share/hadoop/common/*
          /export/apps/hadoop/latest/share/hadoop/hdfs
          /export/apps/hadoop/latest/share/hadoop/hdfs/lib/*
          /export/apps/hadoop/latest/share/hadoop/hdfs/*
          /export/apps/hadoop/latest/share/hadoop/yarn/lib/*
          /export/apps/hadoop/latest/share/hadoop/yarn/*
          /export/apps/hadoop/latest/share/hadoop/mapreduce/lib/*
          /export/apps/hadoop/latest/share/hadoop/mapreduce/*
          /export/apps/hadoop/site/lib/grid-topology-1.0.jar
          /export/apps/hadoop/latest/contrib/capacity-scheduler/*.jar
          /export/apps/hadoop/site/lib/grid-topology-1.0.jar
          /export/apps/hadoop/latest/contrib/capacity-scheduler/*.jar
          /export/apps/hadoop/site/lib/grid-topology-1.0.jar
          /export/apps/hadoop/latest/contrib/capacity-scheduler/*.jar
          /export/apps/hadoop/latest/share/hadoop/yarn/*
          /export/apps/hadoop/latest/share/hadoop/yarn/lib/*
          /export/apps/hadoop/site/etc/hadoop/rm-config/log4j.properties
          org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
          

          After the changes:

          /usr/java/default/bin/java
          -Dproc_resourcemanager
          -Xloggc:/export/apps/hadoop/logs/gc-nn.log-201309162014
          -Dcom.sun.management.jmxremote.port=9010
          -verbose:gc
          -XX:+PrintGCDetails
          -XX:+PrintGCTimeStamps
          -XX:+PrintGCDateStamps
          -Dcom.sun.management.jmxremote=true
          -Dcom.sun.management.jmxremote.authenticate=false
          -Dcom.sun.management.jmxremote.ssl=false
          -Xmx24g
          -Dyarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log
          -Dyarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY
          -Dyarn.log.dir=/export/apps/hadoop/logs
          -Dyarn.log.file=yarn-yarn-resourcemanager-eat1-hcl4083.grid.linkedin.com.log
          -Dyarn.home.dir=/export/apps/hadoop/latest
          -Dyarn.root.logger=INFO,DRFA
          -Djava.library.path=/export/apps/hadoop/latest/lib/native
          -Dhadoop.log.dir=/export/apps/hadoop/logs
          -Dhadoop.log.file=yarn-yarn-resourcemanager-eat1-hcl4083.grid.linkedin.com.log
          -Dhadoop.home.dir=/export/apps/hadoop/latest
          -Dhadoop.id.str=yarn
          -Dhadoop.root.logger=INFO,DRFA
          -Dhadoop.policy.file=hadoop-policy.xml
          -Dhadoop.security.logger=INFO,NullAppender
          -Djava.net.preferIPv4Stack=true
          -classpath
          /export/apps/hadoop/site/lib/grid-topology-1.0.jar
          /export/apps/hadoop/latest/contrib/capacity-scheduler/*.jar
          /export/apps/hadoop/site/etc/hadoop
          /export/apps/hadoop/latest/share/hadoop/common/lib/*
          /export/apps/hadoop/latest/share/hadoop/common/*
          /export/apps/hadoop/latest/share/hadoop/hdfs
          /export/apps/hadoop/latest/share/hadoop/hdfs/lib/*
          /export/apps/hadoop/latest/share/hadoop/hdfs/*
          /export/apps/hadoop/latest/share/hadoop/yarn/lib/*
          /export/apps/hadoop/latest/share/hadoop/yarn/*
          /export/apps/hadoop/latest/share/hadoop/mapreduce/lib/*
          /export/apps/hadoop/latest/share/hadoop/mapreduce/*
          org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
          

          2500 bytes vs. 1750 bytes, almost all the savings are from the classpath.

          There are still a few problems with the 'after' output but... they are mainly from my local config and not coming from the scripts.

          aw Allen Wittenauer added a comment -

          Removed the tarball. Added a patch.

          This still needs a lot of testing and some of the features aren't quite complete (start-dfs.sh firing off secure datanodes, for example).

          httpfs hasn't been touched.

          stevel@apache.org Steve Loughran added a comment -

          Playing with this. Sometimes the generated classpath is, say, share/hadoop/yarn/*; the capacity scheduler is /*.jar. Should everything be consistent?

          My tarball built with mvn clean package -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip=true doesn't seem to contain any conf/ directories, which is presumably unrelated to this build setup. What I do wonder is whether the scripts should care about this fact, and how to react.

          stevel@apache.org Steve Loughran added a comment -

          (ignore that last comment about the conf dirs, they are there. But we do need to think when and how to react to their absence)

          stevel@apache.org Steve Loughran added a comment -

          One thing I will note is that yarn classpath fails, saying no conf dir is set:

          $ yarn classpath
          No HADOOP_CONF_DIR set.
          Please specify it either in yarn-env.sh or in the environment.
          

          The normal hadoop classpath doesn't fail in the same situation.

          stevel@apache.org Steve Loughran added a comment -

          Actually, a rebuild fixes that. What I did have to do was drop hadoop-functions.sh into libexec.

          I don't see hadoop tools getting on the CP: is there a plan for that? Because it would suit me to have a directory into which I could put things to get them on a classpath without playing with HADOOP_CLASSPATH.

          aw Allen Wittenauer added a comment -

          These are sort of out of order.

          Playing with this. Sometimes the generated classpath is, say, share/hadoop/yarn/*; the capacity scheduler is /*.jar. Should everything be consistent?

          At one point I thought about processing the regex string to dedupe it down to the jar level. This opens up a big can of worms, however: if you hit two of them, do you always take the latest? What does latest mean anyway (date or version)? Will we be able to parse the version out of the filename? How do we deal with user overrides? Still take the latest no matter what?

          I've opted to basically let the classpath as it is passed to us stand. Currently the dedupe code is pretty fast for interpreted shell. The only sub-optimization that I might be tempted to do is to normalize any symlinks and relative paths. There is a good chance we'll catch a few dupes this way... but it likely isn't worth the extra execution time.

          It's worth pointing out that a user can feasibly replace the add_classpath code in hadoop-env.sh to override the functionality without changing the base Apache code if they want/need more advanced classpath handling. (e.g., HADOOP-6997 seems to be a non-issue to me since passing duplicate class names is just bad practice; changing the collation is fixing a symptom of a much bigger/dangerous problem. But someone facing this issue could theoretically fix a collation problem on their own, "legally" in a stable way using this trick.)

          I don't see hadoop tools getting on the CP: is there a plan for that?

          Tools path gets added as needed. I seem to recall this works exactly the same way in the current shell scripts.

          Because it would suit me to have a directory into which I could put things to get them on a classpath without playing with HADOOP_CLASSPATH

          I was planning on bringing up this exact issue after I get this one committed. It's a harder discussion because the placement is tricky and there are a lot of options to make this functionality happen. Do we add another env var? Do we just auto-prepend $HADOOP_PREFIX/lib/share/site/*? Do we offer both prepend and append options? Etc. All have pros and cons. Some of the choices only really become feasible after this is committed, however.

          we do need to think when and how to react to (conf dir) absence

          Good point. That's pretty easy to add given that the conf dir handling is fairly well contained now in the hadoop_find_confdir function in hadoop-functions.sh. It's pretty trivial to throw a fatal error if we don't detect, say, hadoop-env.sh in what we resolved HADOOP_CONF_DIR to. Suggestions on what to check for?
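
          For instance, the check could be as simple as this (a sketch, not committed code; the exact thing to check for is the open question):

          if [[ ! -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]]; then
            echo "ERROR: Cannot find hadoop-env.sh in ${HADOOP_CONF_DIR}." 1>&2
            exit 1
          fi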

          actually a rebuild fixes that. What I did have to do was drop hadoop-functions.sh into libexec

          Yeah, after commit this is pretty much a flag day for all of the Hadoop subprojects. I talked to a few folks about it and it was generally felt that this should be one big patch+JIRA rather than several smaller ones per project given the interdependency on common. We'll have to advertise on the various -dev mailing lists post commit to say do a full rebuild. Hopefully folks won't have to change their *-env.sh files and they will continue without modification, however.

          Thanks!

          stevel@apache.org Steve Loughran added a comment -

          I'd be -1 for any attempt in the shell scripts to try and fix up duplicate JARs.

          1. inconsistent operation across platforms
          2. determining which version is latest is non-trivial
          3. doesn't address the lurker of duplicate classes & resources (log4j, hdfs-site) on the CP

          Java already handles duplicates with a first-entry-wins policy. There is also the option of sealing JARs without actually signing them, at which point you get an error when >1 JAR tries to declare classes in the same package. We could seal the hadoop JARs, but it'd raise more support calls.

          What we can do is make sure that:

          • No duplicate JARs get into the default tarball; YARN/lib extends HDFS/lib extends HADOOP_CORE/lib without conflict.
          • No needless JARs (e.g., junit 4.10 in YARN/lib)
          aw Allen Wittenauer added a comment -

          Agreed. The edge cases are too painful.

          The only dupe jar detection that occurs now is some extremely simple string match. So if someone does something like $DIR/lib/blah.jar and $DIR/lib/../lib/blah.jar, it won't get deduped. (It does, however, verify that $DIR/lib and $DIR/lib/../lib exist!) Even with just this simple stuff, it eliminates multiple instances of the conf dir at a minimum.
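
          The described behavior amounts to something like this (a sketch, not the actual hadoop-functions.sh code):

          add_classpath () {
            local check="$1"
            # for wildcard entries, verify the parent directory instead
            [[ "${check}" == */\* ]] && check="${check%/*}"
            [[ ! -e "${check}" ]] && return 1
            case ":${CLASSPATH}:" in
              *":$1:"*) ;;  # exact string already present; $DIR/lib/../lib still slips by
              *) CLASSPATH="${CLASSPATH:+${CLASSPATH}:}$1" ;;
            esac
          }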

          aw Allen Wittenauer added a comment -

          Adding a link to HADOOP-10177 and HDFS-4763 to include the changes added by those patches.

          (It should be noted that neither patch listed included CLI help info for the new sub-commands they added...)

          aw Allen Wittenauer added a comment -

Would anyone be too upset by a patch to trunk that removed the 'deprecated' status (i.e., no longer warning, etc.)? The deprecation will have been in a release by then, so we would no longer need to support the HDFS and MR sub-commands.

          stevel@apache.org Steve Loughran added a comment -

          you are proposing this patch without the deprecation checks?

          Certainly I'd like to see the scripts improved, and you are the only person working full time on this

          aw Allen Wittenauer added a comment -

          At least for trunk (hadoop 3.x), yes, remove the deprecation checks. For any backport, the dep checks would need to stay (obviously).

          (I've noticed that people are adding sub-commands to the deprecation checks that were never in hadoop 1.x and earlier.)

          aw Allen Wittenauer added a comment -

          Note: there are dragons!

distributed-exclude.sh is in the hdfs tree but doesn't appear to actually be included in the distribution.

          aw Allen Wittenauer added a comment -

          For trunk.

          aw Allen Wittenauer added a comment -

          Let's talk about the 'classpath' subcommand.

Today, hadoop classpath returns the classpath of common, hdfs, and yarn. To me, this seems to be the wrong behavior for two major reasons:

• Common has to have knowledge about the subsystems that rely upon it. Ultimately, this is a reverse dependency and (I hope) we can all agree those are bad.
• If I'm building an application that only needs access to (common|hdfs|yarn|mapreduce), my classpath is polluted with extra garbage from the other subproject(s) that I may or may not need. (yarn does offer a classpath subcommand, but it's essentially the same thing as the hadoop classpath. The mapreduce classpath is... yeah...)

          On the plus side, it's one stop shopping. "Hooray! I get everything!", some developer likely said somewhere.

          So I'd like to throw out a proposal.

          I want to re-implement the classpath subcommand such that (hadoop|hdfs|yarn) only return the base classpath for their project. This is (obviously) an incompatible change. Someone who wanted to know what all the classpaths were for all the projects would be required to run all the commands.

To make up for it, however, I believe I can easily introduce a classpath subcommand for every command that uses the common framework. For the non-major commands, I suspect this would be a massive win for debugging. "What the heck is start-dfs.sh using when it fires up the namenode?", I have said to myself many, many times, but using more curse words, some of which you might not have heard before.
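Under this proposal, a debugging session might reduce to something like the following (the behavior shown is the proposed one, not what trunk does today):

hadoop classpath    # common's base classpath only
hdfs classpath      # hdfs's base classpath
yarn classpath      # yarn's base classpath
mapred classpath    # mapreduce's base classpath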

          Another choice might be to have some tricky logic to have subprojects 'register' into the main project on install such that commands like 'hadoop classpath' now know about those subprojects. It won't solve the second bullet point, but it does fix the first.

          Thoughts?

          stevel@apache.org Steve Loughran added a comment -

it's probably too late to change hadoop classpath. We could perhaps include some --strict param that restricts things.

          stevel@apache.org Steve Loughran added a comment -

          while we are at it, HADOOP-9044 is an (unapplied, may need to be rebuilt) entry point to locate a class/resource on the CP & print it out, also useful for diagnostics. After this patch goes in, we could apply that and make it a named operation

          aw Allen Wittenauer added a comment -

          YARN-1429 has (unintentionally) added a lot of complexity with questionable benefits likely due to the parties involved not realizing that the functionality they seek is already there. In any case, with this change, we have this situation:

          HADOOP_USER_CLASSPATH           =>   CLASSPATH:HADOOP_USER_CLASSPATH
          
          YARN_USER_CLASSPATH             =>   CLASSPATH:YARN_USER_CLASSPATH
          
          HADOOP_USER_CLASSPATH &         =>   CLASSPATH:HADOOP_USER_CLASSPATH:YARN_USER_CLASSPATH
YARN_USER_CLASSPATH
          
          HADOOP_USER_CLASSPATH_FIRST     =>   HADOOP_USER_CLASSPATH:CLASSPATH:YARN_USER_CLASSPATH
          
          YARN_USER_CLASSPATH_FIRST       =>   YARN_USER_CLASSPATH:CLASSPATH:HADOOP_USER_CLASSPATH
          
          HADOOP_USER_CLASSPATH_FIRST &   =>   YARN_USER_CLASSPATH:HADOOP_USER_CLASSPATH:CLASSPATH
          YARN_USER_CLASSPATH_FIRST
          

In the case of the other YARN_xxx dupes, the new code causes an override for YARN apps. In order to keep consistency with the rest of the system, we should probably keep the same logic... essentially, YARN_USER_CLASSPATH will override HADOOP_USER_CLASSPATH entirely when running 'yarn xxx'.
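For concreteness, the ordering in the table above can be produced by simple prepend/append logic along these lines (a sketch under that assumption, not the shipped code):

cp="${CLASSPATH}"
if [[ -n "${HADOOP_USER_CLASSPATH}" ]]; then
  if [[ -n "${HADOOP_USER_CLASSPATH_FIRST}" ]]; then
    cp="${HADOOP_USER_CLASSPATH}:${cp}"   # prepend
  else
    cp="${cp}:${HADOOP_USER_CLASSPATH}"   # append
  fi
fi
if [[ -n "${YARN_USER_CLASSPATH}" ]]; then
  if [[ -n "${YARN_USER_CLASSPATH_FIRST}" ]]; then
    cp="${YARN_USER_CLASSPATH}:${cp}"     # prepend; ends up outermost
  else
    cp="${cp}:${YARN_USER_CLASSPATH}"
  fi
fi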

          Ideally, we'd just back out YARN-1429 though and inform users of HADOOP_USER_CLASSPATH.

          aw Allen Wittenauer added a comment -
            CLASSPATH=${CLASSPATH}:$YARN_CONF_DIR/ahs-config/log4j.properties
          ...
            CLASSPATH=${CLASSPATH}:$YARN_CONF_DIR/timelineserver-config/log4j.properties
          

The timeline server added more custom (and likely equally undocumented) log4j.properties locations. Needless to say, those are going away too, just like their rm-config and nm-config brethren.

          aw Allen Wittenauer added a comment -

          Ran across an interesting discrepancy. hadoop-env.sh says:

          # A string representing this instance of hadoop. $USER by default.
          export HADOOP_IDENT_STRING=$USER
          

This implies it could be something that isn't a user. However...

            chown $HADOOP_IDENT_STRING $HADOOP_LOG_DIR
          

          ... we clearly have that assumption. Since the chown has already been removed from the new code, this problem goes away. But should we explicitly state that HADOOP_IDENT_STRING needs to be a user? Is anyone aware of anything else that uses this outside of the Hadoop shell scripts?

          mgrover Mark Grover added a comment -

Hi Allen,
          Good point. In Bigtop, where we create RPM and DEB packages for hadoop and bundle it into our Bigtop distribution, we do rely on this property.
          And, looking at the code, it looks like we set that to be a user (hdfs user in our case).

          Here are the references:
          These get used in the scripts we deploy using puppet for our integration testing:
          https://github.com/apache/bigtop/blob/master/bigtop-deploy/puppet/modules/hadoop/templates/hadoop-env.sh#L78
          https://github.com/apache/bigtop/blob/master/bigtop-deploy/puppet/modules/hadoop/templates/hadoop-hdfs#L20

          This gets used in the default configuration for our secure clusters for integration testing:
          https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/conf.secure/hadoop-env.sh#L56

          This gets used in the init script that starts the datanode services:
          https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/hadoop-hdfs-datanode.svc#L39

          And, this gets used to set certain environment variables before starting various HDFS services:
          https://github.com/apache/bigtop/blob/master/bigtop-packages/src/common/hadoop/hdfs.default#L20

          Hope that helps but please let me know if you need any further info.

          aw Allen Wittenauer added a comment -

          That's very helpful! (Especially since that was going to be the next place I looked since I just happen to have it cloned from git on my dev machine.... it's going to be one of the first big tests I do as I work towards a commit-able patch. )

          Bigtop looks like it is doing what I would expect: setting it for Hadoop, but not using it directly. Which seems to indicate that, at least as far as Bigtop is concerned, we could expand the definition beyond "it must be a user".

          Hadoop also uses HADOOP_IDENT_STR as the setting for the Java hadoop.id.str property. But I can't find a single place where this property is used. IIRC, it was used in ancient times for logging and/or display, but if we don't need the property set anymore because we've gotten wiser, I'd like to just yank that property completely.

          mgrover Mark Grover added a comment -

          Great! Yeah, sounds good to me and in my personal opinion, Bigtop will be ok with expanding the definition. Just let us know when you make that change and what release it would show up in

          And, we don't use HADOOP_IDENT_STR, so no objections from Bigtop side there.

          Let me (or dev@bigtop.apache.org) know if you need anything else. Thank you!

          aw Allen Wittenauer added a comment -

So, I found the only place where hadoop.id.str is still getting used (other than setting it):

          ./bigtop-packages/src/common/hadoop/conf.secure/log4j.properties:log4j.appender.DRFAS.File=/var/local/hadoop/logs/${hadoop.id.str}/${hadoop.id.str}-auth.log
          

          On the surface, this looks like a pretty good use case. So I suppose this property lives for another day. But I'm going to nuke yarn.id.str from the face of the earth since nothing references it.

          mgrover Mark Grover added a comment -

          Sounds good to me!

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12628148/HADOOP-9902.txt
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/3961//console

          This message is automatically generated.

          aw Allen Wittenauer added a comment -

          Here's the latest version of this patch against trunk (svn revision 1598750). There are still a few things I want to fix and I'm sure there are bugs floating around here and there. I'd appreciate any feedback with the patch thus far!

          aw Allen Wittenauer added a comment -

          Let's try this again:

          git rev ca2d0153bf3ec2f7f228bb1e68c0cadf4fb2d6c5
          svn rev 1598764

          aw Allen Wittenauer added a comment -

          Should apply to git commit faf2d78012fd6fcf5fc433ab85b2dbc9d672c125

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12653500/HADOOP-9902-2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-assemblies hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.ha.TestZKFailoverControllerStress
          org.apache.hadoop.hdfs.server.namenode.TestEditLog

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4196//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4196//console

          This message is automatically generated.

          aw Allen Wittenauer added a comment -

          Those failures don't look related to this patch.

          My to do list:

          • Write some release notes (incompat changes, new features, bug fixes)
          • Some relatively minor code cleanup
          • Add some better comments on hadoop-functions.sh, including expected input/output for users who replace those functions
          • Add some developer notes ("how do i add a subcommand?" "how do i add a new command?"), probably on the wiki
          • More security testing
          • Create some related JIRAs (unit testing, rewrite the scripts that I'm skipping this pass)

          FWIW, I'm still leaning towards breaking 'hadoop classpath'. I've had a few discussions offline and many people seem to think that breaking it is actually a good idea.

          That said... it'd be good to have more folks test this in the Real World. I'd like to get this committed soon-ish.

          brocknoland Brock Noland added a comment -

Removal of this usability improvement should be rolled back. Without it, newbie Hadoop users get terrible error messages for common mistakes.

          -    elif [[ "$COMMAND" = -*  ]] ; then
          -        # class and package names cannot begin with a -
          -        echo "Error: No command named \`$COMMAND' was found. Perhaps you meant \`hadoop ${COMMAND#-}'"
          -        exit 1
          
          aw Allen Wittenauer added a comment -

          Great feedback!

          Here's a new patch that incorporates that change as well as fixing a ton of bugs (esp around secure daemons), has some code cleanup, env var namespace cleanup, and one or two minor new features. (See the release notes).

          I think I'm at the point where I need other people to start banging on this patch to test it out. The sooner we get it into trunk, the better.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12654218/HADOOP-9902-3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-assemblies hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.ha.TestZKFailoverControllerStress

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4221//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4221//console

          This message is automatically generated.

          jira.shegalov Gera Shegalov added a comment -

          Allen Wittenauer, would you mind adding a subcommand that prints the native lib path as was proposed in HADOOP-8797?

          aw Allen Wittenauer added a comment -

          Updated patch for git rev 8d5e8c860ed361ed792affcfe06f1a34b017e421.

          This includes many edge case bug fixes, a much more consistent coding style, the requested addition of the hadoop jnipath command, and a run through shellcheck.
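For anyone wanting to reproduce the lint pass, shellcheck can be pointed directly at the scripts; the paths below are illustrative:

shellcheck hadoop-functions.sh
shellcheck bin/hadoop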

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12655595/HADOOP-9902-4.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-assemblies hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.fs.TestSymlinkLocalFSFileContext
          org.apache.hadoop.ipc.TestIPC
          org.apache.hadoop.fs.TestSymlinkLocalFSFileSystem
          org.apache.hadoop.hdfs.tools.TestDFSAdminWithHA

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4265//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4265//console

          This message is automatically generated.

          aw Allen Wittenauer added a comment -

This same patch applies to branch-2, if someone wants to try it out on something closer to a live system.

          aw Allen Wittenauer added a comment -

          Added https://wiki.apache.org/hadoop/ShellScriptProgrammingGuide .

          Also, a status update:

I still have some output from kill that needs to be sent to /dev/null (it triggers extraneous output when the daemon is already down) and some updates to the comments left to do. Barring any additional feedback, my goals for this patch have been met and I mostly consider it finished.

          aw Allen Wittenauer added a comment -

Versus -5, this patch fixes some kill output not being sent to /dev/null, updates some of the hadoop-env.sh commentary, and fixes a code style issue in rotate_logs.
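For reference, the kill fix is the usual redirection idiom, along these lines (illustrative, not the exact patch hunk):

kill "${pid}" > /dev/null 2>&1   # silence "no such process" noise when already down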

          If there is interest, I can make a patch for 2.4.1.

          aw Allen Wittenauer added a comment -

          A note for me: HDFS-2256 has an interesting idea for start-dfs.sh.

          aw Allen Wittenauer added a comment -

          This version:

• Fixes the commit conflict with HADOOP-9921
• Documents a few missing env vars (that have been missing since those features were committed!)
• Moves some defaults out of hadoop-env.sh into hdfs-config.sh so that hadoop-env.sh can run empty
          aw Allen Wittenauer added a comment -

          a) fixed some bugs with HADOOP_SSH_OPTS
          b) proxyserver fell out of the usage information at some point
          c) resourcemanager -format-state-store got truncated
          d) minor whitespace fix

          aw Allen Wittenauer added a comment - - edited

          Did a quick review with Owen O'Malley and Giridharan Kesavan today. Two things came out of that:

          • HADOOP_CONF_DIR should be at the back of the classpath
          • --daemon should support 'status' in addition to start and stop (which means I'll close YARN-2346 as a dupe of this one)

          The first one is easy to do.

          The second one will be a bit more work, but nothing horrific.

          It should be noted that these are both improvements/new functionality vs. the current code in trunk.
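As a usage sketch of the second item, and assuming LSB init-script return conventions (0 = running, 3 = not running), a status check might look like:

hdfs --daemon status namenode
echo $?   # 0 if the namenode is up, 3 if it is stopped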

          aw Allen Wittenauer added a comment -

          Latest patch addresses the comments from the review. It also fixes some stylistic issues, cleans up some errors, and fixes a typo with lsSnapshotDir.

          aw Allen Wittenauer added a comment -

          HADOOP_CONF_DIR should be at the back of the classpath

          Doing this broke logging. I'll be putting up a new patch that reverts that.

          aw Allen Wittenauer added a comment -

          #10: Relatively minor bug fixes, enhanced error checking for log and pid dirs, more style cleanup, start/stop-dfs message consistency.

          Also, this has been run through (more or less) the hadoop smoke tests from Bigtop trunk. For some reason, some tests didn't work, such as the append test. I didn't bother troubleshooting those. The non-admin CLI tests that failed under bigtop I was able to run manually, so I suspect they failed for other, environmental reasons. Admin CLI tests passed. (For example, the MR tests failed because it looks for MR in /usr/lib/hadoop-mapreduce, despite having the MR home defined. But I can run, e.g., the pi example just fine using the exact same command it has in the test framework.)

          It'd still be great if someone could run this patch through the paces though. I'm running out of things to fix...

          aw Allen Wittenauer added a comment -

          RE: 'hadoop classpath'

          I've had quite a few discussions off-JIRA with folks about this subcommand. Almost universally, everyone is in favor of it just returning common, and adding xyz classpath subcommands to return their subproject jars. It's also one of the few remaining, hardcore things that prevent us from breaking some of the dependencies between projects.

While playing with the bigtop tests today, I was reminded why we can't break classpath without consequence: hadoop jar, and to a lesser extent, hadoop distcp, hadoop archive, and hadoop CLASSNAME. Unless we want to play games with the classpath code and have 'hadoop' pretend to be 'yarn' or 'mapred' or whatever for certain subcommands, hadoop classpath will almost certainly always have to return the full one. Or, as Steve Loughran mentioned, use an additional param to force it to be common only.

          aw Allen Wittenauer added a comment -

          #11 - This is essentially the same patch, but incorporates two trivial changes that I missed/forgot to fix: HADOOP-6746 and HADOOP-6851. The latter has been filed in different forms for several years under a variety of different JIRAs.

          It has also been recommended by a PMC member that I give this patch a week and if no issues pop up, to just commit it to trunk. So let's start the clock on that.

          andrew.wang Andrew Wang added a comment -

          Allen, please do not commit this until it's gotten a detailed review. I'm sympathetic to how it's been sitting around for a while (and it's great that you've run it through bigtop), but I'm still leery since this is a large patch to a sensitive part of the codebase.

          tucu00 Alejandro Abdelnur added a comment -

I've pinged Roman Shaposhnik offline asking if he would have time to review this, as he has intimate knowledge of the current scripts. He is up for it, but he is on vacation till the end of the week. He will jump on it next week.

          schu Stephen Chu added a comment -

          In bin/hadoop usage, the latest patch has:

          echo " key undocumented"
          echo " credential undocumented"

          Should we add the descriptions? For key, it can be something like "manage keys via the KeyProvider."
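Concretely, the placeholder lines could become something like the following (the wording, especially for credential, is illustrative):

echo "  key          manage keys via the KeyProvider"
echo "  credential   interact with credential providers"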

          stevel@apache.org Steve Loughran added a comment -

This is a major patch and something we've needed for a long time ... I think we ought to consider a strategy of "get it in trunk" and then evolve it until happy, finally backporting to branch-2.

Allen, have you a branch on github I can merge (and track) in my own code so that I can play with it in a local kerberized VM running trunk?

          aw Allen Wittenauer added a comment -

          Should we add the descriptions?

          We should. I've been waiting for someone to notice. I put it that way (or another) almost immediately after the code that added that option got committed to trunk. That was about a month or so ago.

          I think we ought to consider a strategy of "get it in trunk" and then evolve it until happy, finally backporting to branch-2.

I've had a lot of discussions with a lot of committers over the past year regarding this change. (Including a significant amount of begging to get a review done.) To me, pushing this to trunk, where end user impact is still extremely low, incompatible changes are OK, and the community gets a chance to play, is really the only viable strategy to build confidence with the vast majority of the committership, review or otherwise. It's unfortunately a large patch. It's unfortunately impractical to split apart due to the nature of the current code base (as soon as you touch hadoop-config.sh or *-env.sh, you impact everything...). Realistically, there will be breakage. It's an unattainable goal to move this code forward in a significant way while maintaining 100% compatibility. It's one of the reasons why I reset my personal goal to getting it just into trunk rather than branch-2. The best we can do is fix or document those cases we identify that will break. The worst is to maintain the status quo, continue ignoring the operational issues the current shell code inflicts, digging the hole deeper with every change that we make (e.g., YARN-1429).

          have you a branch on github I can merge

I did at one point, but I gave up. There is no question this is a large patch covering a lot of ground. The manual merges every few weeks with every sub-command that got added or every minor tweak got to be too much, especially for a branch where I would be the only one paying attention. (As a result, I feel as though I could have a lovely conversation with Sisyphus.)

          aw Allen Wittenauer added a comment -

          rebase for git rev d469993fe9b270d01c4cf4542bb701297996fbac

          Minor stuff:

          • add usage for keys, credential
          • dirs with spaces should work again (this is very very broken in trunk btw)
          • merged in hadoop classpath changes from HADOOP-10903
          • probably worthwhile to point out that this repairs the defaults that HADOOP-10759 broke

          ... will commit on Friday.

          andrew.wang Andrew Wang added a comment -

          I'm -1 on putting this in without review. Roman is apparently going to give this a look, so let's wait for that.

          aw Allen Wittenauer added a comment -

FWIW, I asked Roman (in person, even!) over a month ago to look at it. So what do you view as a reasonable timeline to wait, Andrew Wang? Another month? Two? Or is the clock based on rebasing? Have you even tried it?

          andrew.wang Andrew Wang added a comment -

          It's right in the bylaws, we do review-then-commit here in Hadoop, and I don't see a review and +1 from another committer. I'm not sure what else needs to be said, and angry words aren't going to get me to retract my veto.

          apurtell Andrew Purtell added a comment -

          we do review-then-commit here in Hadoop

          More like ignore-and-drop but I digress.

          atm Aaron T. Myers added a comment -

          Andrew Purtell - that's obviously not a very helpful comment. If you want to have a conversation about our review bandwidth here in the Hadoop project in general, then I'd suggest doing so on common-dev@h.a.o.

          apurtell Andrew Purtell added a comment -

          This issue has been open for a year and has 40 watchers (a good proxy for general interest), yet the contributor is begging for a review and being shut down by a veto. This seems like a good venue to point that out since all the information is on hand at a glance. I'm sure Aaron T. Myers and Andrew Wang can and will tap each other on the shoulder if this or that HDFS patch needs to go in, but there's no bandwidth for review of third party contribution, decided intentionally or through accumulated carelessness. Case in point, this issue. Consider spending the time you might argue with me looking at the patch.

          atm Aaron T. Myers added a comment -

          Andrew Purtell - I think you're overreacting and overgeneralizing. If you care to look you will find many reviews/commits of many contributions with contributors and reviewers from different organizations. The JIRA has been open for a long time, yes, but I think Allen will admit that he has not been pushing on it hard until rather recently. The second rev of the patch was only posted at the beginning of last month.

          Regardless, I think there are two big reasons this JIRA has yet to be reviewed:

          1. It is a very large and important change, hence will require quite some time to thoroughly review, and it really does deserve a thorough review since it's such a critical part of the code.
          2. Many or most of the Hadoop committers are less familiar with shell scripting than they are with other types of programming, and so don't feel well qualified to do so. I know I personally don't.

          Regardless, turning this JIRA into a flame war is not an appropriate way to go about getting it committed. I'd personally love to see this get committed, but like Andrew I don't want to see it (or anything) committed without a review.

          aw Allen Wittenauer added a comment -

          but I think Allen will admit that he has not been pushing on it hard until rather recently. The second rev of the patch was only posted at the beginning of last month.

          There was certainly a pause between Oct and Jan, but it should be pointed out that I deleted previous versions of the patch, in particular the ones not in patch format and ones against branch-2. (Sort by date helps here.) The main chunk of the code has been finished since September.

          Many or most of the Hadoop committers are less familiar with shell scripting than they are with other types of programming, and so don't feel well qualified to do so.

          This begs the question: who has been reviewing all the new shell code that is going in, then? (We've had entire new scripts thrown in from all over, not just minor stuff!) On the basis of this thread, it sounds like the only people who should be reviewing shell code are Roman and (I guess) myself. Yet I haven't been asked to look at any in years...

          stevel@apache.org Steve Loughran added a comment -

          ..even without review by the bash experts, there's nothing to stop the rest of us applying the patch locally to branch-2 and seeing if it works for our personal deployments

          aw Allen Wittenauer added a comment - - edited

          Rebase for current trunk (git rev 66af8b0ed51f082889be3d39f63e28f5920e5cb6); also fixes a bug introduced by HADOOP-10927.

          Additionally, it was requested I provide a branch-2 patch for people to try out. This was a 'lazy' backport: not well tested and probably includes commands that don't exist in branch-2. Clearly not meant to be committed. git rev 13f4ab3e68c6b34e915f807ad67fe4d979b35995

          mgrover Mark Grover added a comment -

          While I don't consider myself a Hadoop or a bash scripting "expert", I think I have a reasonable understanding of both, and I have been following this JIRA along and made a few comments back in May, so I will be happy to take another look as well. Stay tuned!

          rvs Roman Shaposhnik added a comment -

          The patch is huge. It is also a needed one. I don't think it is realistic to expect anybody to be able to just look at the patch and predict all the potential changes to the current behavior. I don't think this should be the goal of the review. The way I'm approaching it is this: if this were a from-scratch implementation, would it be reasonable?

          IOW, reviewing a diff is, in my opinion, futile. Reviewing the final state of the code is fruitful. I'll post my comments from that point of view later on, but I wanted to let everybody know the scope of my review first.

          aw Allen Wittenauer added a comment -

          The way I'm approaching it is this: if this were a from-scratch implementation, would it be reasonable?

          I think that's fair as long as (and I know we've talked about it before, but to clarify to the wider audience) that we also keep in mind that in some ways we're hamstrung by history. If only we could completely break compatibility... how different this change would be! (#1: HADOOP_OPTS would die in a mysterious fire.) There are lots of things that are 'not proper' because of that.

          IOW, reviewing a diff is, in my opinion, futile.

          Absolutely correct. This was never intended to be reviewed as a diff in the traditional sense.

          aw Allen Wittenauer added a comment -

          OK, after being forced to think about JAVA_HEAP_MAX more due to HADOOP-10759, it's pretty clear I made a mistake here in the rewrite. In particular:

          a) I mentioned it in the *-env.sh files. It really should be excised from the documentation so that users don't see it.

          b) It looks like there are multiple code paths where this default gets set. This is (obviously) wrong. Luckily, the correct one is the one that wins, but this really needs to get fixed.

          Of course, for the most part, there is no real impact on the actual functioning of the code, so this doesn't have much bearing on any review. So I'm going to wait until Roman comes back before creating a patch that fixes this (and the space-in-the-path mistake, which is a pretty minor change as well: just add eval in front of the execs. Meh.). I'll either fix them in a subsequent JIRA or in a revision to this patch... whichever comes first, based upon input.

          In a perfect world, I'd remove JAVA_HEAP_MAX entirely since it doesn't really serve much of a purpose (it's really meant to be used internally only) and rely completely on HADOOP_HEAPSIZE.... but I have a hunch that 3rd party folks are using it.
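
          For readers following along, a minimal sketch of the relationship being described (the patch's actual variable handling may differ): HADOOP_HEAPSIZE is the user-facing knob, and JAVA_HEAP_MAX is derived from it internally.

            # Hedged sketch, not the actual patch code: JAVA_HEAP_MAX is
            # internal-only and derived from the user-facing HADOOP_HEAPSIZE.
            if [[ -n "${HADOOP_HEAPSIZE}" ]]; then
              JAVA_HEAP_MAX="-Xmx${HADOOP_HEAPSIZE}m"
            fi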

          aw Allen Wittenauer added a comment -

          I had a discussion with Andrew Wang last night around this patch. As part of that discussion, two points came to light that I think are important for everyone to understand:

          • Yes, testing this patch on a single node is valuable. I'd even argue it's more valuable than testing at scale, because now multiple daemons are exercising the code on a single machine, which should show any conflicts. Testing on more than one node, where the nodes are configured the same, is basically exercising the ssh code.... and that's about it.
          • Even if you aren't comfortable with shell, your testing is still extremely valuable. Take your existing *-env.sh files, pop them in, and see what works and what doesn't work. Ideally, everything works without changes. If they don't work, check the incompat list above. If it isn't listed, please add a note here so I can see if this is a bug in the code or if it is something missing from the release notes.
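
          A hypothetical version of that smoke test (the paths are examples only, not from the patch):

            # Drop existing env files into a new build and see what breaks.
            cp /etc/hadoop/conf/*-env.sh "${HADOOP_PREFIX}/etc/hadoop/"
            "${HADOOP_PREFIX}/sbin/start-dfs.sh"   # watch stderr for warnings
            jps                                    # confirm the daemons came up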

          Thanks.

          rvs Roman Shaposhnik added a comment -

          Allen Wittenauer that's my plan. I'm almost done with manual review. Next step is Bigtop-based testing on a fully distributed cluster.

          rvs Roman Shaposhnik added a comment -

          Here are my comments. The only two that are the dealbreakers are:

          1. It seems that hadoop-functions.sh ends up in sbin/hadoop-functions.sh in the final binary assembly, but hadoop-config.sh looks for it in the HADOOP_LIBEXEC_DIR. Related to this – I think we need to bail in hadoop-config.sh if hadoop-functions.sh can't be found.
          2. In hadoop-common-project/hadoop-common/src/main/bin/hadoop the following doesn't work:
            exec "${JAVA}" "${HADOOP_OPTS}" "${CLASS}" "$@"
            

            it needs to be changed so that HADOOP_OPTS is not quoted. Otherwise the JDK gets confused.
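
          To make the failure mode concrete, a hedged illustration (the option values are invented for the example): quoting ${HADOOP_OPTS} hands the JVM one giant argument instead of separate options.

            HADOOP_OPTS="-Xmx1g -Dhadoop.log.dir=/tmp"

            # Broken: the JVM receives the single argument
            # "-Xmx1g -Dhadoop.log.dir=/tmp" and rejects it.
            exec "${JAVA}" "${HADOOP_OPTS}" "${CLASS}" "$@"

            # Working alternative: unquoted, the shell word-splits
            # HADOOP_OPTS into individual options.
            exec "${JAVA}" ${HADOOP_OPTS} "${CLASS}" "$@"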

          The rest of my notes are here for tracking purposes. I'd appreciate if Allen Wittenauer can comment.

          1. HADOOPOSTYPE needs to be documented
          2. Are we planning to use *-env.sh for documenting all the variables that one may set?
          3. Any reason populate_slaves_file function is in hadoop-config.sh and not in hadoop-functions.sh ?
          4. The following appears to be a sort of no-op if the value is not set (not sure why the extra if [[ -z is needed; see the sketch after this list)
              # default policy file for service-level authorization 
              if [[ -z "${HADOOP_POLICYFILE}" ]]; then
                HADOOP_POLICYFILE=${HADOOP_POLICYFILE:-"hadoop-policy.xml"}
              fi
            
          5. Any reason not to try harder and see what type -p java returns?
            # The java implementation to use.
            export JAVA_HOME=${JAVA_HOME:-"hadoop-env.sh is not configured"}
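
          Returning to item 4, a minimal sketch of the simplification being suggested: the :- expansion already supplies the default, so the guard can simply go.

            # The [[ -z ]] wrapper is redundant; :- already applies the
            # default only when HADOOP_POLICYFILE is unset or empty.
            HADOOP_POLICYFILE=${HADOOP_POLICYFILE:-"hadoop-policy.xml"}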
            
          aw Allen Wittenauer added a comment -

          It seems that hadoop-functions.sh ends up in sbin/hadoop-functions.sh in the final binary assembly but hadoop-config.sh looks for it in the HADOOP_LIBEXEC_DIR.

          Weird! I've been using tar ball builds and it shows up in HADOOP_LIBEXEC_DIR / libexec as expected. I wonder what's going on here then. hadoop-dist.xml even specifies libexec and excludes it from sbin specifically. From my own fresh build of trunk + this patch:

          mbp:hadoop-3.0.0-SNAPSHOT aw$ pwd
          /Users/aw/Src/hadoop-common/hadoop-dist/target/hadoop-3.0.0-SNAPSHOT
          mbp:hadoop-3.0.0-SNAPSHOT aw$ find . -name hadoop-functions.sh
          ./libexec/hadoop-functions.sh
          

          So I guess I need some advice here on a way to reproduce it and/or how to fix this one.

          hadoop-common-project/hadoop-common/src/main/bin/hadoop's extra quotes

          Yup, that's clearly broken. Probably snuck in while I was playing with the space bug. That's an easy fix.

          HADOOPOSTYPE needs to be documented

          This is one of those things that I didn't know what to do with. I really don't want it to exist, to be honest, but with HADOOP-8719, we have to do something in hadoop-env.sh. Stupid Apple and/or Oracle. I think in the end we're just kind of screwed and will need to make it real. :/ I'll change it to HADOOP_OS_TYPE and define it both in hadoop-env.sh and hadoop-functions.sh so that it can be referenced as a real value.
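
          A hedged sketch of what making it "real" might look like (the placement and the Darwin handling are assumptions, not the patch):

            # Define it once so both hadoop-env.sh and hadoop-functions.sh
            # can reference a real value.
            export HADOOP_OS_TYPE=${HADOOP_OS_TYPE:-$(uname -s)}

            case ${HADOOP_OS_TYPE} in
              Darwin)
                # the HADOOP-8719 workaround would hook in here
                ;;
            esac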

          Are we planning to use *-env.sh for documenting all the variables that one may set?

          That was my intent, yes. I'm looking at it from the perspective of your typical /etc/default/*, /etc/sysconfig, etc, type file.

          Any reason populate_slaves_file function is in hadoop-config.sh and not in hadoop-functions.sh ?

          It was really meant as a utility function only for hadoop-config.sh to simplify the options listing/parsing code. But there's no reason it can't be moved. I'll do that as well as prefix it with hadoop_ to keep the namespace clean.

          The following appears to be a sort of no-op if value is not set

          Yup, correct. Another easy fix. (... and clearly bad copypasta from the current code ...)

          Any reason not to try harder and see what type -p java returns?

          I think it's too risky. I'd much rather have an explicit JAVA_HOME than assume that /usr/bin/java (or whatever happens to get returned) is "good". 'type -p' can also be wildly unpredictable since it uses the hashed value... this is not necessarily the first one that shows up in the path!

          Thanks!

          aw Allen Wittenauer added a comment -

          9902-14 changes:

          • We now error and exit if hadoop-functions.sh isn't found in HADOOP_LIBEXEC_DIR.
          • All of the standalone exec JAVAs have been moved into a new hadoop_java_exec to be used for non-daemons. This now has an eval in front of it so that paths with spaces work again for MOST commands (most sbin/* commands are still broken, however); see the sketch after this list. It also fixes Roman Shaposhnik's 2nd blocker issue.
          • HADOOP_OS_TYPE documented and made a standard var.
          • populate_slaves_file moved to functions.sh and renamed to hadoop_populate_slaves_file.
          • A few of the functions in hadoop-functions.sh are now marked as not user replaceable.
          • A few echos changed to hadoop_error calls.
          • Removed the useless empty check around HADOOP_POLICYFILE.
          • hadoop_os_tricks now uses HADOOP_OS_TYPE rather than its custom one, and the bindv6only var was made local. Also, a note about HADOOP_ALLOW_IPV6 being dev-only was added in case curious eyes trying to figure out how to get past the IPv6 protection see it.
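
          A hedged sketch of what such an eval-wrapped exec could look like (a sketch only, assuming JAVA and CLASS are already set; the function in the patch may differ):

            hadoop_java_exec_sketch ()
            {
              # Single-quote ${JAVA} for the eval pass so embedded spaces
              # survive; leave HADOOP_OPTS bare so it splits into options.
              eval exec "'${JAVA}'" ${HADOOP_OPTS} "'${CLASS}'" '"$@"'
            }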

          hadoop-functions.sh is still showing up in libexec for me with this patch, so I am not sure how to proceed on that issue.

          rvs Roman Shaposhnik added a comment -

          Allen Wittenauer here's how to repro: you just need to rebuild the source distribution tarball

          $ mvn package -Psrc -DskipTests
          $ (cd /tmp/ ; tar xzvf - ) < ./hadoop-dist/target/hadoop-2.6.0-SNAPSHOT-src.tar.gz 
          $ cd /tmp/hadoop-2.6.0*
          $ mvn -Dsnappy.prefix=x -Dbundle.snappy=true -Dsnappy.lib=/usr/lib64 -Pdist -Pnative -Psrc -Dtar -DskipTests -DskipTest -DskipITs install
          
          aw Allen Wittenauer added a comment -

          Still not seeing this in trunk on my Mac:

          $ mvn package -Psrc -DskipTests
          $ (cd /tmp/ ; tar xzvf - ) < ./hadoop-dist/target/hadoop-3.0.0-SNAPSHOT-src.tar.gz 
          $ cd /tmp/hadoop-3.0.0*
          $ mvn -Pdist -Psrc -Dtar -DskipTests -DskipTest -DskipITs  -Dtomcat.download.url=file:///Users/aw/Src/dl/apache-tomcat-6.0.36.tar.gz install 
          $ tar tvzf hadoop-dist/target/hadoop-3.0.0-SNAPSHOT.tar.gz | grep functions
          -rwxr-xr-x  0 aw     wheel   29875 Aug  8 08:04 hadoop-3.0.0-SNAPSHOT/libexec/hadoop-functions.sh
          

          I'll try this on a Linux box with the native bits enabled here in a bit.

          rvs Roman Shaposhnik added a comment -

          Allen Wittenauer, hm, this is weird. I'll definitely re-try myself today. One small thing I've noticed: you're trying this on trunk while I was doing it with branch-2. Not sure if that makes any difference.

          aw Allen Wittenauer added a comment -

          Patch is built for trunk and I haven't done much testing against branch-2. So that could very well be the issue, Roman Shaposhnik. I wonder if the changes to hadoop-assemblies/src/main/resources/assemblies/hadoop-dist.xml aren't getting applied?

          I'll spin up a branch-2 as soon as this fresh trunk test finishes.

          aw Allen Wittenauer added a comment -

          Good news? I've duplicated the issue with branch-2 and the -13 branch-2 patch. It is definitely concerning that this is happening. But since I'm not actually targeting that branch with this change, I'm not really inclined to spend much time on this problem. (The branch-2 patch was really just for convenience to test the code.)

          aw Allen Wittenauer added a comment -

          OK, tracked down the bug. The -13 branch-2 patch is completely missing the hadoop-dist.xml changes.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12660570/HADOOP-9902-14.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-assemblies hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.security.token.delegation.web.TestWebDelegationToken
          org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4443//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4443//console

          This message is automatically generated.

          aw Allen Wittenauer added a comment -

          Test failures are obviously unrelated.

          Patch -14 deals with the issues that Roman Shaposhnik discovered.

          tucu00 Alejandro Abdelnur added a comment -

          Allen Wittenauer, I was under the impression we were targeting this for branch-2; is that not the case? If we don't do that, given that we don't have imminent plans to create a branch-3 out of trunk, we are at risk of getting things stale in trunk as people add changes in branch-2 only.

          aw Allen Wittenauer added a comment -

          I was under the impression we were targeting this for branch-2; is that not the case?

          It hasn't been my intention to commit this to branch-2 for a very long time. Others have expressed interest in a back port, though. Of course, while this patch definitely moves the needle the most, there are still lots of smaller projects that need to be finished (see the blocked by list) for a comprehensive fix.

          we are at risk of getting things stale in trunk as people add changes in branch-2 only.

          There are already changes in trunk that aren't in branch-2. This would just be another one (albeit probably the biggest one). If trunk is getting 'stale', then that sounds like an issue for the PMC to take up. It doesn't really have much bearing on this patch, IMO.

          tucu00 Alejandro Abdelnur added a comment -

          If trunk is getting 'stale', then that sounds like an issue for the PMC to take up.

          I'm being proactive on this one. I'm trying to avoid getting into that situation. I'd love to get this in, just in a way it is exercised and refined ASAP. Else, a year from now or more we'll be battling with it.

          What are the key issues to be addressed for getting this in branch-2 and how can we take care of it?

          atm Aaron T. Myers added a comment -

          I agree with Allen Wittenauer on the trunk/branch-2 question. We quite clearly can't commit this patch to branch-2 because of the compat issues, at least not without some fairly substantial scaling back of this change.

          Based on some recent discussions on some of the lists, seems like the motivation for a release off of trunk (i.e. 3.x) is building. This change being only on trunk would add to the motivation to make a release from that branch.

          aw Allen Wittenauer added a comment - - edited

          If there is no other technical feedback, I'll commit this on Friday given that all of Roman's issues have been addressed.

          andrew.wang Andrew Wang added a comment -

          Thanks for the work here Allen, +0 from me.

          arpitagarwal Arpit Agarwal added a comment -

          What are the key issues to be addressed for getting this in branch-2 and how can we take care of it?

          It would be good to know this before we commit to trunk. Thanks, Arpit.

          tucu00 Alejandro Abdelnur added a comment -

          Arpit, given the release notes, a bunch of incompatible changes. I've missed that before. So it cannot go in branch-2 as is, only trunk.

          My concern is that it will sit idle in trunk until a Hadoop 3 release. If others don't care about it, well, I'm +0 on this.

          Allen, you should have an explicit +1 before committing. Roman seems to have reviewed things in detail; I would ping him to stamp the +1.

          sureshms Suresh Srinivas added a comment -

          Allen Wittenauer, I would like to review these scripts as well. Please give me till next Wednesday (earlier if I can find time). In general, this is a major rewrite. It could have been done in multiple increments in separate jiras to help review better.

          Some high level comments - Are there any concerns you see with the existing environment in mandating bash v3? Also, can you please add the new functionality (jnipath, distch) in a separate jira, instead of mixing it with the rewrite?

          chris.douglas Chris Douglas added a comment -

          Suresh Srinivas: the current patch has received a fair amount of feedback and the testing will become stale. Could you complete a review this week? While a thorough review could take a while, validating the general direction should be quick. The details can be worked out as followup, if you're satisfied with the cleanup generally.

          sureshms Suresh Srinivas added a comment - - edited

          Chris Douglas, this is very good work by Allen Wittenauer to do the much-needed cleanup. However, other than Roman, I have not seen any committer review this change thoroughly and be ready to +1 it. Even Roman has a bunch of caveats. I am not sure reviews can be effective where the rewrite and new functionality have all happened together. If the only concern is this patch becoming stale, I will help in rebasing it.

          sureshms Suresh Srinivas added a comment -

          While a thorough review could take a while, validating the general direction should be quick.

          I am happy with the general direction. My concern is about the possible incompatibilities and breaking the existing set of tools. Also bugs (which we can always fix and stabilize in trunk).

          rvs Roman Shaposhnik added a comment -

          Just to close a loop on this from my side: at this point I'm confident in this change to go into trunk. Sure, we may discover small hiccups, but the bulk of it is extremely solid.

          +1

          aw Allen Wittenauer added a comment - - edited

          Thanks Roman Shaposhnik!

          It could have been done in multiple increments in separate jiras to help review better.

          Not really. As has been pointed out before, once you touch hadoop-config.sh or *-env.sh in any sort of major way, you are pretty much touching everything since all the pieces are so tightly interlocked. As a result, you'd be reviewing the entire code base almost every time.

          Additionally, the whole point of my posting of test code, changes, random discussion points, etc as I went along was so that the 30+ people who have been watching this JIRA for almost a year now could point out dumb things I did and make suggestions. Some took advantage of it and helped me get rid of some stupidity on my part, either here in the JIRA or elsewhere. I owe much gratitude to them.

          It's probably worth pointing out that a good chunk of the new features have been floating around in JIRAs in patch available status since pre-v1 days. We clearly never cared enough to review these features when they were already separate and the patches viable. This was an opportunity to bring these good ideas and operational fixes forward.

          Are there any concerns you see with the existing environment in mandating bash v3?

          Nope. Bash v3 shipped with, for example, Fedora 3 and Red Hat 8.x. These are operating systems that existed before Hadoop did. FreeBSD and NetBSD didn't, and may still not, ship bash at all as part of their core distribution. (It is, of course, in pkgsrc.) So we've been broken on them forever anyway. (Hai FreeBSD people who beat me up at conferences year after year... ) That release note is specifically for them since they always have to install bash anyway. I suppose we could always move to something like zsh which has a BSD-compatible license and then ship it with Hadoop too. (POSIX sh, BTW, is a significant performance drop.)

          (Some of the bash folks were completely surprised I made the requirement so low given that all modern OSes ship with v4.)
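
          A hedged sketch of a version guard consistent with that requirement (not necessarily what the patch does); it uses POSIX-safe syntax so the check itself runs even under a non-bash shell:

            # Refuse to run under anything older than bash v3.
            if [ -z "${BASH_VERSINFO}" ] || [ "${BASH_VERSINFO}" -lt 3 ]; then
              echo "ERROR: bash v3 or newer is required." 1>&2
              exit 1
            fi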

          My concern is about the possible incompatibilities and breaking the existing set of tools.

          Again, as discussed before, this is exactly why this is going into trunk and not branch-2. I'm treating this as an incompatible change even though I suspect that the vast majority of stuff will "just work". This comes from having used a form of this code for over a year now, both secure and insecure, multiple operating systems, multiple configs, multiple types of different ways to config, Hadoop v2.0.x through trunk, single hosts and multiple hosts, talking about config with folks at conferences, running through shellcheck, etc, etc.

          To me, the biggest, most potentially breaking change is really going to be the dropping of append in -env.sh. We've only gotten away with it because we've depended upon undocumented JVM behavior. But we can't dedupe JVM flags and support append in any sort of reliable manner. Given the number of complaints, questions, and even JIRAs around "why so many Xmx's?", it's clear that append had to go.
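
          A hedged before/after illustration of the append problem (the values are invented for the example):

            # Old auto-append behavior: each layer appended, so flags
            # duplicated and only "worked" because the JVM let the last
            # occurrence win.
            HADOOP_OPTS="${HADOOP_OPTS} -Xmx1000m"   # shipped default
            HADOOP_OPTS="${HADOOP_OPTS} -Xmx4g"      # site override, appended
            echo "${HADOOP_OPTS}"                    # duplicate -Xmx flags

            # New convention: a setting carries the entire value, no append.
            HADOOP_OPTS="-Xmx4g"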

          But to restate, yet again, this is going into trunk. Stuff may break. Hopefully not, but if we can't put incompatible changes there, we've got bigger problems.

          stevel@apache.org Steve Loughran added a comment -

          ... while I'm not a competent enough bash coder to review this at all, I agree with the strategy of committing to trunk with a view to -> branch-2 when we are happy. Putting it in trunk says to all "this is the future shell script platform" and encourages others to take it up and use it, to help see where it doesn't work, since now a trunk checkout + build is all that is needed to get it on.

          I think a timeline for branch-2 merging would be good too, so that people know not to get distracted by fixes on the current scripts

          sureshms Suresh Srinivas added a comment -

          I reviewed the code the best I could. I only reviewed core hadoop and hdfs changes. It is really hard given that code formatting changes are mixed with real improvements, etc. This is a change that could have been done in a feature branch. Allen Wittenauer, reviews certainly could have been made easier that way. That said, thank you for cleaning up the scripts. It looks much better now!

          Comments:

          1. bin/hadoop no longer checks for the hdfs commands portmap and nfs3. Is this intentional?
          2. hadoop-daemon.sh usage no longer prints the --hosts optional parameter in usage; this is intentional, right? Also, do all daemons now support the status option along with start and stop?
          3. locating HADOOP_PREFIX is repeated in bin/hadoop and hadoop-daemon.sh (this can be optimized in a future patch)
          4. start-all.sh and stop-all.sh exit with a warning. Why retain the code after that? Expect users to delete the exit at the beginning?
          5. hadoop_error is not used in some cases and echo is still used.
          6. hadoop-env.sh - we should document the GC configuration for max, min, young generation starting and max size. Also, I think the secondary namenode should just be set to the primary namenode settings. This can be done in another jira. BTW, nice job explicitly specifying the overridable functions in hadoop-env.sh!
          7. cowsay is cute, but it can get annoying. Hopefully hadoop_usage is in every script (I checked, it is).
          sureshms Suresh Srinivas added a comment -

          BTW I forgot to include the main part of my comment. +1 for the patch with the comments addressed (and comments which explicitly state things can be done in another jira can be done separately).

          Thanks Allen Wittenauer for the rewrite!

          aw Allen Wittenauer added a comment -

          bin/hadoop no longer checks for the hdfs commands portmap and nfs3. Is this intentional?

          Yes. Those commands were never hooked into the hadoop command in the Apache source that I saw... but I guess I could have missed one? In any case, I didn't see a reason to have an explicit check for something that never existed as a result, especially considering how much other, actually deprecated stuff is there.

          (It could be argued that for trunk all of these deprecations should be removed since it's going to be a major release since they were put in. In other words, they were in 1.x, deprecated in 2.x, and if this is going into 3.x, we could remove them. There's some discussion on that in this jira.)

          hadoop-daemon.sh usage no longer prints the --hosts optional parameter in usage; this is intentional, right?

          Correct. --hosts only ever worked as far as back as I looked with hadoop-daemons.sh (plural) and related commands. The --hosts in hadoop-daemon.sh's (singular) usage was a very longstanding (and amusing) bug.

          Also, do all daemons now support the status option along with start and stop?

          If those daemons use the shell daemon framework (hadoop_daemon_handler, hadoop_secure_daemon_handler, etc) in hadoop-functions, yes. So, barring bugs or different functionality in the Java code, this should cover all current daemons started by yarn, hdfs, and mapred. This means kms, httpfs, etc, that haven't been converted yet unfortunately do not. I've got another jira open to rewrite those to use the new stuff.

          To cover what I suspect is the future question, if one adds a daemon following the pattern (daemon=true being the big one) to the current commands, that daemon will get the status handling and more (stop, start, logs, pids, etc) for free. This also means that if we add, e.g. 'restart', all daemons will get it too. Consolidating all of this daemon handling makes this much much easier. There is some other cleanup that should probably happen here to make it easier to add new --daemon capability though. (e.g., changing hadoop_usage everywhere is a pain.)
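
          A short usage sketch of that consolidated handling (the daemon name is illustrative):

            hdfs --daemon start namenode
            hdfs --daemon status namenode   # sets $? to an LSB-style code
            echo $?                         # 0 = running
            hdfs --daemon stop namenode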

          locating HADOOP_PREFIX is repeated in bin/hadoop and hadoop-daemon.sh (this can be optimized in a future patch)

          It's intentional because we need to run through the initialization code to find where the hdfs command lives. Totally agree it's ugly, but with the hadoop-layout.sh code that was introduced in 0.21, we're sort of stuck here. FWIW, mapred and yarn have the same ugliness.

          start-all.sh and stop-all.sh exit with a warning. Why retain the code after that? Expect users to delete the exit at the beginning?

          I started to clean this up but realized it could wait. So at some point, I plan to clean this up and make it functional, esp wrt HADOOP-6590 and some... tricks. I didn't see any harm in leaving the code there for reference. Plus, as you noticed, if someone wanted to make their own, they could pull it out, delete those lines, and be on their way.

          hadoop_error is not used in some cases and echo is still used.

          Correct. hadoop_error isn't defined yet in some situations so the script has to echo to stderr manually. In particular, when the code is looking for HADOOP_LIBEXEC_DIR and the location of hadoop-functions.sh... so that it can define those functions.
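
          A minimal sketch of that bootstrap-time pattern (the exact message is invented):

            # Before hadoop-functions.sh is sourced there is no hadoop_error,
            # so failures must go to stderr by hand.
            if [[ ! -f "${HADOOP_LIBEXEC_DIR}/hadoop-functions.sh" ]]; then
              echo "ERROR: Cannot find hadoop-functions.sh." 1>&2
              exit 1
            fi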

          hadoop-env.sh - we should document the GC configuration for max, min, young generation starting and max size.

          This should probably be a part of HADOOP-10950. I'm going to rework the generic heap management to allow for setting Xms, get rid of JAVA_HEAP, etc. Since this is another (but thankfully smaller) "touch everything" JIRA, it'd be great if you could update that one with what you had in mind. (I think I know what you have in mind, since I suspect this reflects upon the examples I put in for NN, etc GC stuff. )

          hadoop_usage is in every script (I checked, it is).

          Shame on you for ruining my easter egg... but your check wasn't very thorough.

          sureshms Suresh Srinivas added a comment - edited

          Yes. Those commands were never hooked into the hadoop command in the Apache source that I saw... but I guess I could have missed one? In any case, I didn't see a reason to have an explicit check for something that never existed as a result, especially considering how much other, actually deprecated stuff is there.

          When you say hooked into the hadoop command, do you mean usage? If so, that might be a bug. Brandon Li, can bin/hadoop be used to start the nfs gateway and portmap? If so, bin/hadoop may need to include them in its case statement so those commands are delegated to the hdfs script.

          Shame on you for ruining my easter egg...

          Sorry

          but your check wasn't very thorough

          I know one now. A script named with three letters? Did I miss more?

          aw Allen Wittenauer added a comment -

          I'll commit this after a Jenkins run.

          -15 fixes the missing line in the copyright in hadoop-config.sh. That's sort of important...

          aw Allen Wittenauer added a comment - edited

          When you say hooked into hadoop command, do you mean usage?

          Nope. I specifically mean 'hadoop portmap' and 'hadoop nfs3' never worked. The code always declared them deprecated and told the user to run hdfs instead.

          I know one now. A script named with three letters? Did I miss more?

          That's my secret.

          sureshms Suresh Srinivas added a comment -

          Nope. I specifically mean 'hadoop portmap' and 'hadoop nfs3' never worked. The code always declared them as a deprecated command and to run hdfs instead.

          Doesn't the following from the old script print a warning and delegate nfs3 and portmap to the hdfs script?

          namenode|secondarynamenode|datanode|dfs|dfsadmin|fsck|balancer|fetchdt|oiv|dfsgroups|portmap|nfs3)
              echo "DEPRECATED: Use of this script to execute hdfs command is deprecated." 1>&2
              echo "Instead use the hdfs command for it." 1>&2
              echo "" 1>&2
              #try to locate hdfs and if present, delegate to it.  
              shift
              if [ -f "${HADOOP_HDFS_HOME}"/bin/hdfs ]; then
                exec "${HADOOP_HDFS_HOME}"/bin/hdfs ${COMMAND/dfsgroups/groups}  "$@"
              elif [ -f "${HADOOP_PREFIX}"/bin/hdfs ]; then
                exec "${HADOOP_PREFIX}"/bin/hdfs ${COMMAND/dfsgroups/groups} "$@"
              else
                echo "HADOOP_HDFS_HOME not found!"
                exit 1
              fi
              ;;
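
          As a side note, ${COMMAND/dfsgroups/groups} above is plain bash pattern substitution: it rewrites only the dfsgroups alias and passes everything else, portmap and nfs3 included, through unchanged. For example:

          COMMAND=dfsgroups
          echo "${COMMAND/dfsgroups/groups}"   # prints "groups"
          COMMAND=portmap
          echo "${COMMAND/dfsgroups/groups}"   # prints "portmap" (no match, unchanged)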
          
          aw Allen Wittenauer added a comment -

          Doesn't the following from the old script print a warning and delegate nfs3 and portmap to the hdfs script?

          It does. But if you notice, all of those other commands were in Hadoop 1.x... before the hdfs command existed. portmap and nfs3 came way way way after that. In other words, running e.g. 'hadoop portmap' as a command was never documented as valid. So the only way someone would run that would be accidentally. If we do that for every command that someone might accidentally run, we're gonna be in for a bad time.

          aw Allen Wittenauer added a comment -

          Looks like I'm wrong:

          http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html

          Why oh why did we document this using deprecated usage? I'll make a -16 that puts these back and file a jira to fix the documentation.

          brandonli Brandon Li added a comment -

          ... running e.g. 'hadoop portmap' as a command was never documented as valid.

          We actually documented it in 2.3 release:
          http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html

          But, we can update the NFS doc to use hdfs script instead from 3.0 onward.

          aw Allen Wittenauer added a comment - edited

          HDFS-6868 filed for the portmap and nfs3 option.

          aw Allen Wittenauer added a comment -

          -16: re-deprecate the previously not deprecated but documented hadoop nfs3 and hadoop portmap subcommands

          aw Allen Wittenauer added a comment -

          Jenkins appears to be pretty horked. Patch clearly applies, there are no tests associated with the shell code, and previous versions applied with no issues.... so I'm just going to commit -16.

          Thanks all!

          aw Allen Wittenauer added a comment -

          Commit to trunk svn rev 1618847. Closing.

          hadoopqa Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12662613/HADOOP-9902-16.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 2.0.3) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these unit tests in hadoop-assemblies hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

          org.apache.hadoop.ha.TestActiveStandbyElector
          org.apache.hadoop.ha.TestZKFailoverController
          org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/4502//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/4502//console

          This message is automatically generated.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #651 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/651/)
          HADOOP-9902. Shell script rewrite (aw) (aw: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618847)

          • /hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-dist.xml
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-config.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemon.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemons.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-layout.sh.example
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/rcc
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/slaves.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/start-all.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/stop-all.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/distribute-exclude.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs-config.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/refresh-namenodes.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-balancer.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-dfs.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-secure-dns.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/stop-balancer.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/stop-dfs.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/stop-secure-dns.sh
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred-config.sh
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh
          • /hadoop/common/trunk/hadoop-mapreduce-project/conf/mapred-env.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/slaves.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/start-yarn.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/stop-yarn.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-config.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-daemon.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-daemons.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/conf/yarn-env.sh
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #6087 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6087/)
          HADOOP-9902. Shell script rewrite (aw) (aw: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618847)

          • /hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-dist.xml
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-config.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemon.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemons.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-layout.sh.example
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/rcc
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/slaves.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/start-all.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/stop-all.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/distribute-exclude.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs-config.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/refresh-namenodes.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-balancer.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-dfs.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-secure-dns.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/stop-balancer.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/stop-dfs.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/stop-secure-dns.sh
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred-config.sh
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh
          • /hadoop/common/trunk/hadoop-mapreduce-project/conf/mapred-env.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/slaves.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/start-yarn.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/stop-yarn.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-config.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-daemon.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-daemons.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/conf/yarn-env.sh
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #1842 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1842/)
          HADOOP-9902. Shell script rewrite (aw) (aw: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618847)

          • /hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-dist.xml
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-config.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemon.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemons.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-layout.sh.example
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/rcc
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/slaves.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/start-all.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/stop-all.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/distribute-exclude.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs-config.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/refresh-namenodes.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-balancer.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-dfs.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-secure-dns.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/stop-balancer.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/stop-dfs.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/stop-secure-dns.sh
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred-config.sh
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh
          • /hadoop/common/trunk/hadoop-mapreduce-project/conf/mapred-env.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/slaves.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/start-yarn.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/stop-yarn.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-config.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-daemon.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-daemons.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/conf/yarn-env.sh
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #1868 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1868/)
          HADOOP-9902. Shell script rewrite (aw) (aw: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618847)

          • /hadoop/common/trunk/hadoop-assemblies/src/main/resources/assemblies/hadoop-dist.xml
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-config.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemon.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemons.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-layout.sh.example
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/rcc
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/slaves.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/start-all.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/stop-all.sh
          • /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/distribute-exclude.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs-config.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/refresh-namenodes.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-balancer.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-dfs.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/start-secure-dns.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/stop-balancer.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/stop-dfs.sh
          • /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/stop-secure-dns.sh
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred-config.sh
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mr-jobhistory-daemon.sh
          • /hadoop/common/trunk/hadoop-mapreduce-project/conf/mapred-env.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/slaves.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/start-yarn.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/stop-yarn.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-config.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-daemon.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn-daemons.sh
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/conf/yarn-env.sh
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #6090 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6090/)
          [post-HADOOP-9902] mapred version is missing (Akira AJISAKA via aw) (aw: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619201)

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapredCommands.apt.vm
          jianhe Jian He added a comment -

          Allen Wittenauer, thanks for rewriting the scripts. I tried a couple of yarn commands and found a couple of issues.
          1. The yarn command usage info seems broken. e.g., the "yarn application" command used to print usage info; now it throws an exception.

          WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
          Invalid Command Usage : 
          Exception in thread "main" java.lang.IllegalArgumentException: cmdLineSyntax not provided
          	at org.apache.commons.cli.HelpFormatter.printHelp(HelpFormatter.java:472)
          	at org.apache.commons.cli.HelpFormatter.printHelp(HelpFormatter.java:418)
          	at org.apache.commons.cli.HelpFormatter.printHelp(HelpFormatter.java:334)
          	at org.apache.hadoop.yarn.client.cli.ApplicationCLI.printUsage(ApplicationCLI.java:246)
          	at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:234)
          	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
          	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
          	at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:76)
          

          2. Starting/stopping a yarn daemon doesn't print anything anymore. Earlier it printed something like "starting resourcemanager", which I think is useful.

          Also, for changes as big as this, we should typically open separate jiras in YARN/MR to track the YARN/MR side changes, so that they draw enough attention in the YARN/MR community as well.

          aw Allen Wittenauer added a comment -

          The yarn command usage info seems broken. e.g., the "yarn application" command used to print usage info; now it throws an exception.

          Looks like I missed this command line stack manipulation for ApplicationCLI:

          elif [ "$COMMAND" = "application" ] ||
               [ "$COMMAND" = "applicationattempt" ] ||
               [ "$COMMAND" = "container" ]; then
            CLASS=org.apache.hadoop.yarn.client.cli.ApplicationCLI
            YARN_OPTS="$YARN_OPTS $YARN_CLIENT_OPTS"
            set -- $COMMAND $@
          

          ... probably because it is a very oddball thing to do. I'll file a JIRA for that.
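
          A minimal sketch of what putting that back might look like in the new yarn script (quoting added for safety; the eventual fix may differ):

          elif [ "${COMMAND}" = "application" ] ||
               [ "${COMMAND}" = "applicationattempt" ] ||
               [ "${COMMAND}" = "container" ]; then
            CLASS=org.apache.hadoop.yarn.client.cli.ApplicationCLI
            YARN_OPTS="${YARN_OPTS} ${YARN_CLIENT_OPTS}"
            # re-insert the subcommand so ApplicationCLI can tell which of the
            # three entry points was requested; quoting avoids word-splitting
            set -- "${COMMAND}" "$@"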

          Starting/stopping a yarn daemon doesn't print anything anymore. Earlier it printed something like "starting resourcemanager", which I think is useful.

          Putting it inside yarn-daemon.sh or anywhere else breaks the init.d script experience for ops teams. So, if anything, this should get changed in yarn-daemons.sh and make it more of an analog to hadoop-daemons.sh.

          we should open separate jiras in YARN/MR to track the YARN/MR side changes, so that they draw enough attention in the YARN/MR community as well.

          It's an interesting data point that the follow-up JIRAs for this one (to fix bugs, add a few more features, etc.) have surprisingly few watchers, if they have any at all. That probably hints at another reason why this part of the code base never gets fixes. It was decided early on (see above) to do this as one big JIRA. That was still, IMO, the correct decision based upon history and the current state.

          While this was a sweeping change across all of the subprojects, all of these individual communities should be paying attention to what is happening in common due to the dependency structure.

          aw Allen Wittenauer added a comment - edited

          I've filed YARN-2436 and YARN-2437 (under the new script component I added yesterday...) for those two issues.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #653 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/653/)
          [post-HADOOP-9902] mapred version is missing (Akira AJISAKA via aw) (aw: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619201)

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapredCommands.apt.vm
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #1844 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1844/)
          [post-HADOOP-9902] mapred version is missing (Akira AJISAKA via aw) (aw: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619201)

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapredCommands.apt.vm
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1870 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1870/)
          [post-HADOOP-9902] mapred version is missing (Akira AJISAKA via aw) (aw: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619201)

          • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-mapreduce-project/bin/mapred
          • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/MapredCommands.apt.vm
          jianhe Jian He added a comment -

          Found one more problem: RM and NM daemon logs now go into the *.out file instead of the *.log file.

          jianhe Jian He added a comment -

          Also, the yarn daemon log file name earlier started with "yarn"; now it starts with "hadoop". Not sure if this is an intentional change.

          aw Allen Wittenauer added a comment -

          Found one more problem: RM and NM daemon logs now go into the *.out file instead of the *.log file.

          As mentioned in the release notes, YARN did a bunch of heinous stuff when it came to log4j settings, counter to the rest of Hadoop and much to the frustration of ops teams. This has been made consistent, so there is a good chance you were relying upon the old behavior. It could be any number of things: NM and RM _OPT settings, dependence upon the nodemanager/log4j.settings file or resourcemanager/log4j.settings file, yarn-env.sh settings, etc. You can always do 'bash -x yarn --daemon start resourcemanager'. The out file should contain the java command line.

          With the shipping *-env.sh files, you should see something similar to:

          java -Dproc_resourcemanager -Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc= -Djava.security.krb5.conf= -Dyarn.log.dir=/Users/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/logs -Dyarn.log.file=hadoop-aw-resourcemanager-aw-mbp-work.local.log -Dyarn.home.dir=/Users/aw/HADOOP/hadoop-3.0.0-SNAPSHOT -Dyarn.root.logger=INFO,RFA -Xmx1g -Dhadoop.log.dir=/Users/aw/HADOOP/hadoop-3.0.0-SNAPSHOT/logs -Dhadoop.log.file=hadoop-aw-resourcemanager-aw-mbp-work.local.log -Dhadoop.home.dir=/Users/aw/HADOOP/hadoop-3.0.0-SNAPSHOT -Dhadoop.id.str=aw -Dhadoop.root.logger=INFO,RFA -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
          

          Also, the yarn daemon log file name earlier started with "yarn"; now it starts with "hadoop". Not sure if this is an intentional change.

          Intentional. This is to make YARN consistent with the rest of the system. (Noticing a theme?)

          aw Allen Wittenauer added a comment -

          ... and, just to answer the question before it gets asked...

          Want to override what the RM uses for logging? Just put this in the yarn-env.sh:

          export YARN_RESOURCEMANAGER_OPTS="-Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA"
          

          The rest of the system will fill in the blanks.
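
          Presumably the same pattern applies to the other daemons; for example, for the NodeManager (an untested sketch):

          export YARN_NODEMANAGER_OPTS="-Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA"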

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #6093 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6093/)
          YARN-2436. [post-HADOOP-9902] yarn application help doesn't work (aw: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619603)

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn
          andrew.wang Andrew Wang added a comment -

          Hopefully an easy question: I'm wondering how to run out of the source directory if I've built with a line like this:

          mvn clean package install -Pdist -pl hadoop-hdfs-project/hadoop-hdfs
          

          I used to source this to get hdfs and so on onto my path, but now it fails looking for a dependency in a different dir:

          export HADOOP_COMMON_HOME=$(pwd)/$(ls -d hadoop-common-project/hadoop-common/target/hadoop-common-*/)
          export HADOOP_HDFS_HOME=$(pwd)/$(ls -d hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-*/)
          export PATH=$HADOOP_COMMON_HOME/bin:$HADOOP_HDFS_HOME/bin:$PATH
          
          ERROR: Unable to exec /home/andrew/dev/hadoop/trunk/hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-3.0.0-SNAPSHOT/bin/../libexec/hadoop-functions.sh.
          

          Is there still a way of doing this without doing a full build?

          aw Allen Wittenauer added a comment -

Hmm... it's basically looking for a working libexec. You need a dir with hadoop-functions.sh, hadoop-config.sh, and hdfs-config.sh (for hdfs). So you could construct that manually, point HADOOP_LIBEXEC_DIR at it, and I think all would work.
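For example, a minimal sketch assuming the target trees from the build above; the scratch directory and the exact libexec paths inside the target trees are illustrative, not prescribed:

# Assemble a scratch libexec from the built target trees, then point
# HADOOP_LIBEXEC_DIR at it so the 'hdfs' command can bootstrap itself.
mkdir -p /tmp/dev-libexec
cp hadoop-common-project/hadoop-common/target/hadoop-common-*/libexec/hadoop-functions.sh /tmp/dev-libexec/
cp hadoop-common-project/hadoop-common/target/hadoop-common-*/libexec/hadoop-config.sh /tmp/dev-libexec/
cp hadoop-hdfs-project/hadoop-hdfs/target/hadoop-hdfs-*/libexec/hdfs-config.sh /tmp/dev-libexec/
export HADOOP_LIBEXEC_DIR=/tmp/dev-libexec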

But I definitely see a bug and/or feature here, depending upon one's world view. I've filed a separate JIRA to cover this case (HADOOP-10996), as it's not an insignificant amount of work.

          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #654 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/654/)
          YARN-2436. [post-HADOOP-9902] yarn application help doesn't work (aw: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619603)

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn
          hudson Hudson added a comment -

          SUCCESS: Integrated in Hadoop-Hdfs-trunk #1845 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1845/)
          YARN-2436. [post-HADOOP-9902] yarn application help doesn't work (aw: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619603)

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn
          hudson Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #1871 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1871/)
          YARN-2436. [post-HADOOP-9902] yarn application help doesn't work (aw: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619603)

          • /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
          • /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn
          aw Allen Wittenauer added a comment -

          FYI: HADOOP-11002 - shell escapes are incompatible with previous releases

          aw Allen Wittenauer added a comment -

Given that the bug-fix JIRAs are getting little to no traction from watchers, much less reviews, some advice on how to proceed on what are clearly bugs (HADOOP-10996, HADOOP-11002, and likely more as they come in) would be appreciated...

          cmccabe Colin P. McCabe added a comment -

          I reviewed HADOOP-11002.

          andrew.wang Andrew Wang added a comment -

Hey Allen, how do you feel about making a new umbrella JIRA to hold all these follow-on changes? It'll logically group everything together, and also avoid spamming this JIRA when something with "post-HADOOP-9902" in the message gets committed. It might be easier for people to track, too.

          aw Allen Wittenauer added a comment -

Sure, that's fine with me.

          aw Allen Wittenauer added a comment -

Umbrella JIRA for post-commit issues: HADOOP-11010.

          aw Allen Wittenauer added a comment -

Since people keep adding themselves as watchers here:

All the new activity is happening either on HADOOP-11010 itself or as its subtasks.

          aw Allen Wittenauer added a comment -

          Updated release notes to contain fixes/changes from HADOOP-11010's children.

          jzhuge John Zhuge added a comment -

The correct link to the wiki: https://wiki.apache.org/hadoop/UnixShellScriptProgrammingGuide
