Uploaded image for project: 'Hadoop Map/Reduce'
  1. Hadoop Map/Reduce
  2. MAPREDUCE-6454

MapReduce doesn't set the HADOOP_CLASSPATH for jar lib in distributed cache.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.7.2, 2.6.2
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      We already set lib jars on distributed-cache to CLASSPATH. However, in some corner cases (like: MR local mode, Hive Map side local join, etc.), we need these jars on HADOOP_CLASSPATH so hadoop scripts can take it in launching runjar process.

      1. MAPREDUCE-6454.patch
        6 kB
        Junping Du
      2. MAPREDUCE-6454-v2.1.patch
        6 kB
        Junping Du
      3. MAPREDUCE-6454-v2.patch
        6 kB
        Junping Du
      4. MAPREDUCE-6454-v3.1.patch
        16 kB
        Junping Du
      5. MAPREDUCE-6454-v3.patch
        14 kB
        Junping Du

        Issue Links

          Activity

          Hide
          djp Junping Du added a comment -

          MRApps already include files in distributed cache when adding them to classpath env of MRApps. However, "jar"s are excluded for some reason (seems not valid now). In this patch, we add it back.

          Show
          djp Junping Du added a comment - MRApps already include files in distributed cache when adding them to classpath env of MRApps. However, "jar"s are excluded for some reason (seems not valid now). In this patch, we add it back.
          Hide
          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 16m 18s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 56s There were no new javac warning messages.
          +1 javadoc 9m 47s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 37s There were no new checkstyle issues.
          -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 21s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 10s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          -1 mapreduce tests 0m 43s Tests failed in hadoop-mapreduce-client-common.
              38m 51s  



          Reason Tests
          Failed unit tests hadoop.mapreduce.v2.util.TestMRApps



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12751170/MAPREDUCE-6454.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 30e342a
          whitespace https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5938/artifact/patchprocess/whitespace.txt
          hadoop-mapreduce-client-common test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5938/artifact/patchprocess/testrun_hadoop-mapreduce-client-common.txt
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5938/testReport/
          Java 1.7.0_55
          uname Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5938/console

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 pre-patch 16m 18s Pre-patch trunk compilation is healthy. +1 @author 0m 0s The patch does not contain any @author tags. +1 tests included 0m 0s The patch appears to include 1 new or modified test files. +1 javac 7m 56s There were no new javac warning messages. +1 javadoc 9m 47s There were no new javadoc warning messages. +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings. +1 checkstyle 0m 37s There were no new checkstyle issues. -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. +1 install 1m 21s mvn install still works. +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse. +1 findbugs 1m 10s The patch does not introduce any new Findbugs (version 3.0.0) warnings. -1 mapreduce tests 0m 43s Tests failed in hadoop-mapreduce-client-common.     38m 51s   Reason Tests Failed unit tests hadoop.mapreduce.v2.util.TestMRApps Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12751170/MAPREDUCE-6454.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / 30e342a whitespace https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5938/artifact/patchprocess/whitespace.txt hadoop-mapreduce-client-common test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5938/artifact/patchprocess/testrun_hadoop-mapreduce-client-common.txt Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5938/testReport/ Java 1.7.0_55 uname Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5938/console This message was automatically generated.
          Hide
          djp Junping Du added a comment -

          The test failure is because YarnConfiguration.YARN_APPLICATION_CLASSPATH is always empty string without any settings. In this case, we should use YarnConfiguration.DEFAULT_YARN_APPLICATION_CLASSPATH as default value. Fix this in v2 patch.

          Show
          djp Junping Du added a comment - The test failure is because YarnConfiguration.YARN_APPLICATION_CLASSPATH is always empty string without any settings. In this case, we should use YarnConfiguration.DEFAULT_YARN_APPLICATION_CLASSPATH as default value. Fix this in v2 patch.
          Hide
          djp Junping Du added a comment -

          Fix some whitespace issue in v2.1 patch.

          Show
          djp Junping Du added a comment - Fix some whitespace issue in v2.1 patch.
          Hide
          hadoopqa Hadoop QA added a comment -



          +1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 16m 14s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 55s There were no new javac warning messages.
          +1 javadoc 10m 12s There were no new javadoc warning messages.
          +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 0m 41s There were no new checkstyle issues.
          +1 whitespace 0m 1s The patch has no lines that end in whitespace.
          +1 install 1m 22s mvn install still works.
          +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse.
          +1 findbugs 1m 11s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 mapreduce tests 0m 47s Tests passed in hadoop-mapreduce-client-common.
              39m 22s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12751255/MAPREDUCE-6454-v2.1.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 22dc5fc
          hadoop-mapreduce-client-common test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5940/artifact/patchprocess/testrun_hadoop-mapreduce-client-common.txt
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5940/testReport/
          Java 1.7.0_55
          uname Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5940/console

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 pre-patch 16m 14s Pre-patch trunk compilation is healthy. +1 @author 0m 0s The patch does not contain any @author tags. +1 tests included 0m 0s The patch appears to include 1 new or modified test files. +1 javac 7m 55s There were no new javac warning messages. +1 javadoc 10m 12s There were no new javadoc warning messages. +1 release audit 0m 22s The applied patch does not increase the total number of release audit warnings. +1 checkstyle 0m 41s There were no new checkstyle issues. +1 whitespace 0m 1s The patch has no lines that end in whitespace. +1 install 1m 22s mvn install still works. +1 eclipse:eclipse 0m 33s The patch built with eclipse:eclipse. +1 findbugs 1m 11s The patch does not introduce any new Findbugs (version 3.0.0) warnings. +1 mapreduce tests 0m 47s Tests passed in hadoop-mapreduce-client-common.     39m 22s   Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12751255/MAPREDUCE-6454-v2.1.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / 22dc5fc hadoop-mapreduce-client-common test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5940/artifact/patchprocess/testrun_hadoop-mapreduce-client-common.txt Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5940/testReport/ Java 1.7.0_55 uname Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5940/console This message was automatically generated.
          Hide
          djp Junping Du added a comment -

          Previous fixes are not right to resolve the issues we met. v3 patch is verified to be working.

          Show
          djp Junping Du added a comment - Previous fixes are not right to resolve the issues we met. v3 patch is verified to be working.
          Hide
          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 21m 33s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          -1 javac 10m 27s The applied patch generated 8 additional warning messages.
          +1 javadoc 12m 24s There were no new javadoc warning messages.
          +1 release audit 0m 24s The applied patch does not increase the total number of release audit warnings.
          -1 checkstyle 2m 22s The applied patch generated 1 new checkstyle issues (total was 26, now 27).
          -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 37s mvn install still works.
          +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse.
          +1 findbugs 4m 13s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 mapreduce tests 9m 49s Tests passed in hadoop-mapreduce-client-app.
          +1 mapreduce tests 0m 54s Tests passed in hadoop-mapreduce-client-common.
          +1 yarn tests 0m 24s Tests passed in hadoop-yarn-api.
              64m 47s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12751510/MAPREDUCE-6454-v3.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 0bc15cb
          javac https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/artifact/patchprocess/diffJavacWarnings.txt
          checkstyle https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
          whitespace https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/artifact/patchprocess/whitespace.txt
          hadoop-mapreduce-client-app test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt
          hadoop-mapreduce-client-common test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/artifact/patchprocess/testrun_hadoop-mapreduce-client-common.txt
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/testReport/
          Java 1.7.0_55
          uname Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/console

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 pre-patch 21m 33s Pre-patch trunk compilation is healthy. +1 @author 0m 0s The patch does not contain any @author tags. +1 tests included 0m 0s The patch appears to include 1 new or modified test files. -1 javac 10m 27s The applied patch generated 8 additional warning messages. +1 javadoc 12m 24s There were no new javadoc warning messages. +1 release audit 0m 24s The applied patch does not increase the total number of release audit warnings. -1 checkstyle 2m 22s The applied patch generated 1 new checkstyle issues (total was 26, now 27). -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. +1 install 1m 37s mvn install still works. +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse. +1 findbugs 4m 13s The patch does not introduce any new Findbugs (version 3.0.0) warnings. +1 mapreduce tests 9m 49s Tests passed in hadoop-mapreduce-client-app. +1 mapreduce tests 0m 54s Tests passed in hadoop-mapreduce-client-common. +1 yarn tests 0m 24s Tests passed in hadoop-yarn-api.     64m 47s   Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12751510/MAPREDUCE-6454-v3.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / 0bc15cb javac https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/artifact/patchprocess/diffJavacWarnings.txt checkstyle https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt whitespace https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/artifact/patchprocess/whitespace.txt hadoop-mapreduce-client-app test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt hadoop-mapreduce-client-common test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/artifact/patchprocess/testrun_hadoop-mapreduce-client-common.txt hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/artifact/patchprocess/testrun_hadoop-yarn-api.txt Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/testReport/ Java 1.7.0_55 uname Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5944/console This message was automatically generated.
          Hide
          djp Junping Du added a comment -

          Fix minor issues like: javadoc warnings, checkstyle, etc.

          Show
          djp Junping Du added a comment - Fix minor issues like: javadoc warnings, checkstyle, etc.
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          I debugged this offline with Junping Du.

          Some offline notes

          • The original problem was tracked at MAPREDUCE-5490 for Hadoop 1, which didn't go in. Essentially, things like hive running as part of Oozie actions do not get the distributed-cache files in their classpath.
          • In Hadoop 1, the MapReduce child wouldn't set CLASSPATH and HADOOP_CLASSPATH to distributed-cache files.
          • In Hadoop 2, the situation got better, CLASSPATH is set correctly to have distributed-cache files. But HADOOP_CLASSPATH doesn't. Unfortunately, hadoop scripts don't respect CLASSPATH that is set, so we have to explicitly set HADOOP_CLASSPATH to also point to distributed cache files.

          The solution for this is to have MapReduce set distributed-cache files in HADOOP_CLASSPATH in addition to setting in CLASSPATH.

          Junping Du, the patch looks good.

          • We are not setting this for the AM, but that sounds okay.
          • We are making sure that any HADOOP_CLASSPATH already set is inherited instead of replaced - good!
          • We are only put the distributed-cache entries, and skipping MR framework etc - good again!

          The patch looks good to me. Been reviewing this offline. Will check this in if Jenkins says okay.

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - I debugged this offline with Junping Du . Some offline notes The original problem was tracked at MAPREDUCE-5490 for Hadoop 1, which didn't go in. Essentially, things like hive running as part of Oozie actions do not get the distributed-cache files in their classpath. In Hadoop 1, the MapReduce child wouldn't set CLASSPATH and HADOOP_CLASSPATH to distributed-cache files. In Hadoop 2, the situation got better, CLASSPATH is set correctly to have distributed-cache files. But HADOOP_CLASSPATH doesn't. Unfortunately, hadoop scripts don't respect CLASSPATH that is set, so we have to explicitly set HADOOP_CLASSPATH to also point to distributed cache files. The solution for this is to have MapReduce set distributed-cache files in HADOOP_CLASSPATH in addition to setting in CLASSPATH. Junping Du, the patch looks good. We are not setting this for the AM, but that sounds okay. We are making sure that any HADOOP_CLASSPATH already set is inherited instead of replaced - good! We are only put the distributed-cache entries, and skipping MR framework etc - good again! The patch looks good to me. Been reviewing this offline. Will check this in if Jenkins says okay.
          Hide
          aw Allen Wittenauer added a comment -

          What happens if the user has HADOOP_CLASSPATH set in hadoop-env.sh?

          Show
          aw Allen Wittenauer added a comment - What happens if the user has HADOOP_CLASSPATH set in hadoop-env.sh?
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          What happens if the user has HADOOP_CLASSPATH set in hadoop-env.sh?

          • If the user has this on the submission node, that is not getting inherited - this is similar to what we do fro CLASSPATH.
          • If the admin sets it in the daemons, and also configures yarn to white-list those envs, we are inheriting them into the task. Again similar to CLASSPATH.
          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - What happens if the user has HADOOP_CLASSPATH set in hadoop-env.sh? If the user has this on the submission node, that is not getting inherited - this is similar to what we do fro CLASSPATH. If the admin sets it in the daemons, and also configures yarn to white-list those envs, we are inheriting them into the task. Again similar to CLASSPATH.
          Hide
          aw Allen Wittenauer added a comment -

          If the admin sets it in the daemons, and also configures yarn to white-list those envs, we are inheriting them into the task. Again similar to CLASSPATH.

          If the original problem is that the bin commands don't pick up this up, this fix makes the assumption that HADOOP_CLASSPATH allows inheritance from the parent shell. The default examples in trunk specifically don't do this to prevent the double settings problem that plagues prior releases.

          In fact, since HADOOP_CLASSPATH is really intended for users to use, why are we overloading it? For trunk at least, it would probably be better to have a different var that is handled via mapreduce's shellprofile.d bit.

          Show
          aw Allen Wittenauer added a comment - If the admin sets it in the daemons, and also configures yarn to white-list those envs, we are inheriting them into the task. Again similar to CLASSPATH. If the original problem is that the bin commands don't pick up this up, this fix makes the assumption that HADOOP_CLASSPATH allows inheritance from the parent shell. The default examples in trunk specifically don't do this to prevent the double settings problem that plagues prior releases. In fact, since HADOOP_CLASSPATH is really intended for users to use, why are we overloading it? For trunk at least, it would probably be better to have a different var that is handled via mapreduce's shellprofile.d bit.
          Hide
          hadoopqa Hadoop QA added a comment -



          -1 overall



          Vote Subsystem Runtime Comment
          0 pre-patch 18m 55s Pre-patch trunk compilation is healthy.
          +1 @author 0m 0s The patch does not contain any @author tags.
          +1 tests included 0m 0s The patch appears to include 1 new or modified test files.
          +1 javac 7m 52s There were no new javac warning messages.
          +1 javadoc 9m 45s There were no new javadoc warning messages.
          +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings.
          +1 checkstyle 1m 57s There were no new checkstyle issues.
          -1 whitespace 0m 1s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.
          +1 install 1m 27s mvn install still works.
          +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse.
          +1 findbugs 3m 52s The patch does not introduce any new Findbugs (version 3.0.0) warnings.
          +1 mapreduce tests 9m 27s Tests passed in hadoop-mapreduce-client-app.
          +1 mapreduce tests 0m 48s Tests passed in hadoop-mapreduce-client-common.
          +1 yarn tests 0m 24s Tests passed in hadoop-yarn-api.
              55m 30s  



          Subsystem Report/Notes
          Patch URL http://issues.apache.org/jira/secure/attachment/12751583/MAPREDUCE-6454-v3.1.patch
          Optional Tests javadoc javac unit findbugs checkstyle
          git revision trunk / 7642f64
          whitespace https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5948/artifact/patchprocess/whitespace.txt
          hadoop-mapreduce-client-app test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5948/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt
          hadoop-mapreduce-client-common test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5948/artifact/patchprocess/testrun_hadoop-mapreduce-client-common.txt
          hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5948/artifact/patchprocess/testrun_hadoop-yarn-api.txt
          Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5948/testReport/
          Java 1.7.0_55
          uname Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5948/console

          This message was automatically generated.

          Show
          hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 pre-patch 18m 55s Pre-patch trunk compilation is healthy. +1 @author 0m 0s The patch does not contain any @author tags. +1 tests included 0m 0s The patch appears to include 1 new or modified test files. +1 javac 7m 52s There were no new javac warning messages. +1 javadoc 9m 45s There were no new javadoc warning messages. +1 release audit 0m 23s The applied patch does not increase the total number of release audit warnings. +1 checkstyle 1m 57s There were no new checkstyle issues. -1 whitespace 0m 1s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. +1 install 1m 27s mvn install still works. +1 eclipse:eclipse 0m 34s The patch built with eclipse:eclipse. +1 findbugs 3m 52s The patch does not introduce any new Findbugs (version 3.0.0) warnings. +1 mapreduce tests 9m 27s Tests passed in hadoop-mapreduce-client-app. +1 mapreduce tests 0m 48s Tests passed in hadoop-mapreduce-client-common. +1 yarn tests 0m 24s Tests passed in hadoop-yarn-api.     55m 30s   Subsystem Report/Notes Patch URL http://issues.apache.org/jira/secure/attachment/12751583/MAPREDUCE-6454-v3.1.patch Optional Tests javadoc javac unit findbugs checkstyle git revision trunk / 7642f64 whitespace https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5948/artifact/patchprocess/whitespace.txt hadoop-mapreduce-client-app test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5948/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt hadoop-mapreduce-client-common test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5948/artifact/patchprocess/testrun_hadoop-mapreduce-client-common.txt hadoop-yarn-api test log https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5948/artifact/patchprocess/testrun_hadoop-yarn-api.txt Test Results https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5948/testReport/ Java 1.7.0_55 uname Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Console output https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5948/console This message was automatically generated.
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          If the original problem is that the bin commands don't pick up this up, this fix makes the assumption that HADOOP_CLASSPATH allows inheritance from the parent shell. The default examples in trunk specifically don't do this to prevent the double settings problem that plagues prior releases.

          The original problem is that the commands don't pick up CLASSPATH set by users on the shell - Owen tells me offline that this went long back.

          In fact, since HADOOP_CLASSPATH is really intended for users to use, why are we overloading it? For trunk at least, it would probably be better to have a different var that is handled via mapreduce's shellprofile.d bit.

          The specific scenario here user is spawning bin/hadoop commands within his/her tasks - so this blurs the lines between interactive vs non-interactive usecases - and the user is expecting distributed-cache to be accessed in the spawned shells.

          I am going to get this in only into branch-2 and branch-2.7/2.6. We will have to think more about the right-approach for trunk. Will open a separate ticket for this.

          -1 whitespace 0m 1s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix.

          Will do jenkins. Ty.

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - If the original problem is that the bin commands don't pick up this up, this fix makes the assumption that HADOOP_CLASSPATH allows inheritance from the parent shell. The default examples in trunk specifically don't do this to prevent the double settings problem that plagues prior releases. The original problem is that the commands don't pick up CLASSPATH set by users on the shell - Owen tells me offline that this went long back. In fact, since HADOOP_CLASSPATH is really intended for users to use, why are we overloading it? For trunk at least, it would probably be better to have a different var that is handled via mapreduce's shellprofile.d bit. The specific scenario here user is spawning bin/hadoop commands within his/her tasks - so this blurs the lines between interactive vs non-interactive usecases - and the user is expecting distributed-cache to be accessed in the spawned shells. I am going to get this in only into branch-2 and branch-2.7/2.6. We will have to think more about the right-approach for trunk. Will open a separate ticket for this. -1 whitespace 0m 1s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. Will do jenkins. Ty.
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          Committed this to trunk, branch-2, 2.7 and 2.6. Thanks Junping.

          Junping Du, can we please followup on the trunk changes? Thanks..

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - Committed this to trunk, branch-2, 2.7 and 2.6. Thanks Junping. Junping Du , can we please followup on the trunk changes? Thanks..
          Hide
          djp Junping Du added a comment -

          Thanks Vinod Kumar Vavilapalli for review/commit and Allen Wittenauer for comments.

          For trunk at least, it would probably be better to have a different var that is handled via mapreduce's shellprofile.d bit.

          This also sounds like a good way. Agree that we can discuss more later (on another JIRA) for trunk.

          We will have to think more about the right-approach for trunk. Will open a separate ticket for this.

          +1. Just filed MAPREDUCE-6458 to address this.

          Show
          djp Junping Du added a comment - Thanks Vinod Kumar Vavilapalli for review/commit and Allen Wittenauer for comments. For trunk at least, it would probably be better to have a different var that is handled via mapreduce's shellprofile.d bit. This also sounds like a good way. Agree that we can discuss more later (on another JIRA) for trunk. We will have to think more about the right-approach for trunk. Will open a separate ticket for this. +1. Just filed MAPREDUCE-6458 to address this.
          Hide
          aw Allen Wittenauer added a comment - - edited

          Just so it's on record so when someone hits this problem: this is fragile and subject to breakage, regardless of the version of hadoop in play. It all depends upon how users have HADOOP_CLASSPATH configured in hadoop-env.sh and yarn-env.sh.

          Show
          aw Allen Wittenauer added a comment - - edited Just so it's on record so when someone hits this problem: this is fragile and subject to breakage, regardless of the version of hadoop in play. It all depends upon how users have HADOOP_CLASSPATH configured in hadoop-env.sh and yarn-env.sh.
          Hide
          vinodkv Vinod Kumar Vavilapalli added a comment -

          Just so it's on record so when someone hits this problem: this is fragile and subject to breakage, regardless of the version of hadoop in play. It all depends upon how users have HADOOP_CLASSPATH configured in hadoop-env.sh and yarn-env.sh.

          It is a bit fragile, for sure, but it doesn't by default depend on what is configured in *-env.sh like you said. This is because HADOOP_CLASSPATH is not part of the default white-listed environment that goes from YARN to the apps.

          Show
          vinodkv Vinod Kumar Vavilapalli added a comment - Just so it's on record so when someone hits this problem: this is fragile and subject to breakage, regardless of the version of hadoop in play. It all depends upon how users have HADOOP_CLASSPATH configured in hadoop-env.sh and yarn-env.sh. It is a bit fragile, for sure, but it doesn't by default depend on what is configured in *-env.sh like you said. This is because HADOOP_CLASSPATH is not part of the default white-listed environment that goes from YARN to the apps.
          Hide
          aw Allen Wittenauer added a comment -

          This is because HADOOP_CLASSPATH is not part of the default white-listed environment that goes from YARN to the apps.

          If I have HADOOP_CLASSPATH="foo" in hadoop-env.sh, when I run a shell command (say "hadoop version") as part of my app, that's going to overwrite whatever Hadoop tries to set it to. The whitelist is completely irrelevant.

          Show
          aw Allen Wittenauer added a comment - This is because HADOOP_CLASSPATH is not part of the default white-listed environment that goes from YARN to the apps. If I have HADOOP_CLASSPATH="foo" in hadoop-env.sh, when I run a shell command (say "hadoop version") as part of my app, that's going to overwrite whatever Hadoop tries to set it to. The whitelist is completely irrelevant.
          Hide
          aw Allen Wittenauer added a comment -

          Here's a test you can do to prove my point.

          $ echo "HADOOP_CLASSPATH=/tmp" >> $HADOOP_CONF_DIR/hadoop-env.sh
          $ hadoop classpath

          You should see /tmp

          $ HADOOP_CLASSPATH=/etc hadoop classpath

          You'll still see /tmp. You won't see /etc. (Well, unless your classpath is really weird already.)

          Show
          aw Allen Wittenauer added a comment - Here's a test you can do to prove my point. $ echo "HADOOP_CLASSPATH=/tmp" >> $HADOOP_CONF_DIR/hadoop-env.sh $ hadoop classpath You should see /tmp $ HADOOP_CLASSPATH=/etc hadoop classpath You'll still see /tmp. You won't see /etc. (Well, unless your classpath is really weird already.)
          Hide
          shanyu shanyu zhao added a comment -

          The patch in this JIRA caused regression. Previously the MR container inherit HADOOP_CLASSPATH from system environment, now it completely replaced it. I think we need to add this line to the end of setClasspath() in MRApps.java:

              MRApps.addToEnvironment(
                  environment,
                  hadoopClasspathEnvVar,
                  System.getenv(hadoopClasspathEnvVar),
                  conf);
          
          Show
          shanyu shanyu zhao added a comment - The patch in this JIRA caused regression. Previously the MR container inherit HADOOP_CLASSPATH from system environment, now it completely replaced it. I think we need to add this line to the end of setClasspath() in MRApps.java: MRApps.addToEnvironment( environment, hadoopClasspathEnvVar, System .getenv(hadoopClasspathEnvVar), conf);
          Hide
          djp Junping Du added a comment -

          MAPREDUCE-6619 is filed to track/fix this issue. Thanks shanyu zhao for reporting this issue.

          Show
          djp Junping Du added a comment - MAPREDUCE-6619 is filed to track/fix this issue. Thanks shanyu zhao for reporting this issue.
          Hide
          andrew.wang Andrew Wang added a comment -

          Hi folks, I noticed this JIRA is not present in trunk, though Vinod's comment says:

          Committed this to trunk, branch-2, 2.7 and 2.6. Thanks Junping.

          Based on the above discussion, I think this was only intended for branch-2 and thus git is correct, but I would appreciate clarification.

          Show
          andrew.wang Andrew Wang added a comment - Hi folks, I noticed this JIRA is not present in trunk, though Vinod's comment says: Committed this to trunk, branch-2, 2.7 and 2.6. Thanks Junping. Based on the above discussion, I think this was only intended for branch-2 and thus git is correct, but I would appreciate clarification.
          Hide
          djp Junping Du added a comment -

          Andrew Wang, Yes. this and MAPREDUCE-6619 only for branch-2 now. For trunk, Allen want to do something different but I am not sure the progress there. May be we should consider to port this and MAPREDUCE-6619 to trunk as well?

          Show
          djp Junping Du added a comment - Andrew Wang , Yes. this and MAPREDUCE-6619 only for branch-2 now. For trunk, Allen want to do something different but I am not sure the progress there. May be we should consider to port this and MAPREDUCE-6619 to trunk as well?
          Hide
          aw Allen Wittenauer added a comment -

          This patch makes a huge assumption about the shell scripts that is no longer true in trunk (and my attempts to teach bash in this jira failed). In the end, users who use hadoop, hdfs, yarn, and mapred commands in their code will discover they don't have distributed cache jars in their classpath.

          Show
          aw Allen Wittenauer added a comment - This patch makes a huge assumption about the shell scripts that is no longer true in trunk (and my attempts to teach bash in this jira failed). In the end, users who use hadoop, hdfs, yarn, and mapred commands in their code will discover they don't have distributed cache jars in their classpath.

            People

            • Assignee:
              djp Junping Du
              Reporter:
              djp Junping Du
            • Votes:
              0 Vote for this issue
              Watchers:
              13 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development