Hadoop Map/Reduce / MAPREDUCE-5850

PATH environment variable contains duplicate values in map and reduce tasks on Windows.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 3.0.0, 2.4.0
    • Fix Version/s: None
    • Component/s: client
    • Labels: None

      Description

      The value of the PATH environment variable gets appended twice before execution of a container for a map or reduce task. This is ultimately harmless at runtime, but it does cause a failure in TestMiniMRChildTask when running on Windows.

          Activity

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12640886/MAPREDUCE-5850.1.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 1 new or modified test files.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. There were no new javadoc warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4536//testReport/
          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4536//console

          This message is automatically generated.

          Chris Nauroth added a comment -

          I'm going to resolve this as duplicate against MAPREDUCE-5642 and move discussion there.

          Chris Nauroth added a comment -

          This is only a problem on Windows. It doesn't happen on Linux. Here is a description of how this happens.

          In MRJobConfig, the default value of mapreduce.admin.user.env is defined to set the PATH environment variable on Windows so that tasks will be able to find and load hadoop.dll.

            public final String DEFAULT_MAPRED_ADMIN_USER_ENV = 
                Shell.WINDOWS ? 
                    "PATH=%PATH%;%HADOOP_COMMON_HOME%\\bin":
                    "LD_LIBRARY_PATH=$HADOOP_COMMON_HOME/lib/native";
          

          TaskAttemptImpl#createCommonContainerLaunchContext sets up the base environment, which includes picking up mapreduce.admin.user.env. This is the point where the behavior diverges from Linux. On Linux, the default sets LD_LIBRARY_PATH, so the common context won't have a PATH. On Windows, the default sets PATH, so the common context will have one.

              // Add the env variables passed by the admin
              MRApps.setEnvFromInputString(
                  environment, 
                  conf.get(
                      MRJobConfig.MAPRED_ADMIN_USER_ENV, 
                      MRJobConfig.DEFAULT_MAPRED_ADMIN_USER_ENV), conf
                  );
          

          Then, at task launch time, we end up setting PATH again via a call to TaskAttemptImpl#createContainerLaunchContext -> MapReduceChildJVM#setVMEnv -> MRApps#setEnvFromInputString -> Apps#setEnvFromInputString. This uses Apps#addToEnvironment to set the new value in the environment, and the logic of this method appends to existing values:

            @Public
            @Unstable
            public static void addToEnvironment(
                Map<String, String> environment,
                String variable, String value, String classPathSeparator) {
              String val = environment.get(variable);
              if (val == null) {
                val = value;
              } else {
                val = val + classPathSeparator + value;
              }
              environment.put(StringInterner.weakIntern(variable), 
                  StringInterner.weakIntern(val));
            }
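
The double append described above can be reproduced without Hadoop on the classpath. The following is a minimal sketch, assuming a simplified stand-in for Apps#addToEnvironment (no string interning) and a hypothetical helper, pathAfterTwoPasses, that applies the same admin value once for the common context and once at task launch. The class and helper names are illustrative only and do not exist in Hadoop, and the sketch leaves %PATH% and %HADOOP_COMMON_HOME% unexpanded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DuplicatePathDemo {

  // Simplified stand-in for Apps#addToEnvironment (assumption: interning
  // omitted). If the variable already has a value, the new value is appended.
  static void addToEnvironment(Map<String, String> environment,
      String variable, String value, String separator) {
    String existing = environment.get(variable);
    environment.put(variable,
        existing == null ? value : existing + separator + value);
  }

  // Hypothetical helper: applies the same admin-supplied PATH value twice,
  // mirroring the common-context step followed by the task-launch step.
  static String pathAfterTwoPasses(String adminValue, String separator) {
    Map<String, String> environment = new LinkedHashMap<>();
    addToEnvironment(environment, "PATH", adminValue, separator); // common context
    addToEnvironment(environment, "PATH", adminValue, separator); // task launch
    return environment.get("PATH");
  }

  public static void main(String[] args) {
    // Value taken from the Windows branch of DEFAULT_MAPRED_ADMIN_USER_ENV.
    String adminValue = "%PATH%;%HADOOP_COMMON_HOME%\\bin";
    System.out.println(pathAfterTwoPasses(adminValue, ";"));
  }
}
```

Run as written, this prints `%PATH%;%HADOOP_COMMON_HOME%\bin;%PATH%;%HADOOP_COMMON_HOME%\bin`, so the admin-supplied entry appears twice. With a Linux-style default the first pass would set LD_LIBRARY_PATH instead, and PATH would be set only once.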
          

          I haven't been able to come up with a clean fix for this. We can't change the default value of mapreduce.admin.user.env, because tasks depend on it to find the native code (an absolute must on Windows). We can't drop the appending behavior, because valid use cases depend on it. Adding a special case for Windows + PATH seems hacky. Does anyone else have ideas?

          Since this is ultimately harmless, we might consider simply relaxing the assertion in TestMiniMRChildTask. I'm attaching a patch that does that. This passes on Mac and Windows.
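
The attached patch is not reproduced on this page, so the following is only a sketch of what a relaxed check could look like, not the actual change to TestMiniMRChildTask. It assumes a hypothetical helper, containsAllEntries, that accepts a PATH value as long as every expected entry is present at least once, however many times it is repeated.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class RelaxedPathCheck {

  // Hypothetical helper (not part of Hadoop): returns true if every entry of
  // expectedPath appears at least once in actualPath. Duplicates in
  // actualPath are tolerated, which is the relaxation discussed above.
  static boolean containsAllEntries(String actualPath, String expectedPath,
      String separator) {
    // Quote the separator so it is treated literally by the regex-based split.
    String splitter = Pattern.quote(separator);
    List<String> actual = Arrays.asList(actualPath.split(splitter));
    List<String> expected = Arrays.asList(expectedPath.split(splitter));
    return actual.containsAll(expected);
  }

  public static void main(String[] args) {
    String expected = "C:\\tools;C:\\hadoop\\bin";
    String duplicated = expected + ";" + expected;
    // A strict equality check would reject the duplicated value.
    System.out.println(duplicated.equals(expected));
    // The relaxed check accepts it.
    System.out.println(containsAllEntries(duplicated, expected, ";"));
  }
}
```

The comparison here is exact and case-sensitive. A real test on Windows might also need to account for case-insensitive paths, which this sketch does not attempt.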

            People

            • Assignee: Chris Nauroth
            • Reporter: Chris Nauroth
            • Votes: 0
            • Watchers: 3
