Hadoop Common
HADOOP-9422

HADOOP_HOME should not be required to be set to be able to launch commands using hadoop.util.Shell

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Not sure why this is an enforced requirement, especially in cases where a deployment is done using multiple tarballs (one each for common/hdfs/mapreduce/yarn).


          Activity

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12586372/HADOOP-9422.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 eclipse:eclipse. The patch built with eclipse:eclipse.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2601//testReport/
          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2601//console

          This message is automatically generated.

          Arpit Gupta added a comment -

          I can take this up.

          One thing we could do is not define the static variable, but instead call the appropriate method when it is needed. I will upload a patch for that. Also, it looks like getHadoopHome is not used anywhere in the project, and getQualifiedBinPath is used only in getWinUtilsPath, and only in a Windows environment.

          So we would also remove getHadoopHome and leave getQualifiedBinPath where it is.
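
          A minimal sketch of this lazy-lookup approach (illustrative class and method layout, assuming detection checks the hadoop.home.dir system property and then the HADOOP_HOME env-var; not the actual patch):

          import java.io.File;
          import java.io.IOException;

          // Sketch: no static home-dir field, so nothing throws at class-load time.
          public class LazyShell {

            // Resolve the Hadoop home directory only on demand.
            private static String resolveHadoopHome() throws IOException {
              String home = System.getProperty("hadoop.home.dir");
              if (home == null) {
                home = System.getenv("HADOOP_HOME");
              }
              if (home == null) {
                throw new IOException("HADOOP_HOME or hadoop.home.dir are not set.");
              }
              return home;
            }

            // Only callers that actually need a binary under the home dir see the failure.
            public static String getQualifiedBinPath(String executable) throws IOException {
              File exeFile = new File(new File(resolveHadoopHome(), "bin"), executable);
              if (!exeFile.canExecute()) {
                throw new IOException("Could not locate executable " + exeFile);
              }
              return exeFile.getCanonicalPath();
            }
          }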

          Enis Soztutar added a comment -

          The requirement, as far as I can tell, was that Windows support needs to find the native winutils library, and it looks under the Hadoop home directory. +1 on making it a soft dependency on non-Windows platforms, and, in a follow-up issue, on moving winutils under the Hadoop native lib dir.

          Chris Nauroth added a comment -

          When we fix this, we may also need to make a change in TestUserGroupInformation. (See HADOOP-9500.)

          Steve Loughran added a comment -

          Also, because it is logged at ERROR, the only way to stop it appearing in the logs is to add a line to the log4j configuration that suppresses all ERROR-level events:

          log4j.logger.org.apache.hadoop.util.Shell=FATAL
          

          This can actually hide messages that are in fact significant, namely the failure of commands to exec properly.

          Steve Loughran added a comment -

          We should also have the Shell code not log at ERROR when the home directory is missing, at least not until it is actually needed. Otherwise you get stack traces at ERROR every time you try to run local Pig jobs:

          2013-04-19 11:55:40,219 [main] ERROR util.Shell (Shell.java:checkHadoopHome(180)) - Failed to detect a valid hadoop home directory
          java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
          	at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:163)
          	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:186)
          	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:77)
          	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:339)
          	at org.apache.pig.builtin.PigStorage.setLocation(PigStorage.java:374)
          	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:473)
          	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:294)
          	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:177)
          	at org.apache.pig.PigServer.launchPlan(PigServer.java:1264)
          	at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249)
          	at org.apache.pig.PigServer.execute(PigServer.java:1239)
          	at org.apache.pig.PigServer.access$400(PigServer.java:121)
          	at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1553)
          	at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
          	at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:991)
          	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
          	at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
          	at org.apache.pig.PigServer.registerScript(PigServer.java:590)
          	at org.apache.pig.PigServer$registerScript.call(Unknown Source)
          	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:42)
          	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
          	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:124)
          	at org.apache.hadoop.fs.swift.integration.IntegrationTestBase.registerPigResource(IntegrationTestBase.groovy:196)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          	at java.lang.reflect.Method.invoke(Method.java:597)
          	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:267)
          	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:52)
          	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:46)
          	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:133)
          	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:149)
          	at org.apache.hadoop.fs.swift.integration.pig.TestPig.testLoadGeneratedData(TestPig.groovy:48)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          	at java.lang.reflect.Method.invoke(Method.java:597)
          	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
          	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
          	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
          	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
          	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
          	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
          	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
          	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
          	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
          	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
          	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
          	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
          	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
          	at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
          	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
          	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
          	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
          	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          	at java.lang.reflect.Method.invoke(Method.java:597)
          	at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray2(ReflectionUtils.java:208)
          	at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:158)
          	at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:86)
          	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
          	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:95)
          

          It's not an error, and the stack trace is irrelevant most of the time, so it just creates extraneous noise in the logs, and, being logged at ERROR, confusing noise.
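
          One way to get that behavior (a sketch with illustrative field names; checkHadoopHome stands for the existing detection logic): remember the detection failure quietly instead of logging it, and rethrow only when a caller actually asks for the home directory.

          // Sketch: capture the failure at class-load time, surface it only on use.
          private static final String HADOOP_HOME_DIR;
          private static final IOException HADOOP_HOME_FAILURE;

          static {
            String home = null;
            IOException failure = null;
            try {
              home = checkHadoopHome();   // existing detection logic
            } catch (IOException e) {
              failure = e;                // remembered, not logged at ERROR here
            }
            HADOOP_HOME_DIR = home;
            HADOOP_HOME_FAILURE = failure;
          }

          // Throws only when the home directory is genuinely needed.
          public static String getHadoopHomeDir() throws IOException {
            if (HADOOP_HOME_FAILURE != null) {
              throw HADOOP_HOME_FAILURE;
            }
            return HADOOP_HOME_DIR;
          }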

          Harsh J added a comment -

          It was deprecated when folks added packaging support to Apache Hadoop itself (that packaging is no longer used, but the changes have stuck since then, and even Apache Bigtop uses the new env-vars today): HADOOP-6255. The new env-var is HADOOP_PREFIX for the base, and HADOOP_*_HOME (COMMON, MAPRED, YARN, and HDFS) for each component's library directory.

          Bikas Saha added a comment -

          According to Harsh, HADOOP_HOME is deprecated in trunk. In that case, we might have to remove the reference to HADOOP_HOME. Harsh J, can you please point to the JIRA where HADOOP_HOME was deprecated?

          Hitesh Shah added a comment -

          @Harsh, there is some static code in Shell.java that requires HADOOP_HOME to be set. As per @Bikas and @Chris, it seems it is only required for correct behavior on Windows.

          Harsh J added a comment -

          I'm unsure where this problem applies (trunk is listed as the affected version), but as you'd already know, HADOOP_HOME is already deprecated and should also be removed in trunk; and we don't ship separate tarballs as of yet, so is supporting such a deployment a new thing?

          Sorry if I'm missing context here.

          Chris Nauroth added a comment -

          I think we can safely relax the logic in Shell so that it's only a hard dependency when running on Windows, where we need to find winutils.exe. This can be satisfied from either the hadoop.home.dir system property or the HADOOP_HOME environment variable, without necessarily requiring both to be set.
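
          A sketch of that relaxed check (illustrative method names; WINDOWS stands for the platform detection Shell already performs):

          // Sketch: accept either source; absence is fatal only on Windows, where
          // winutils.exe must be located under the home directory.
          private static String getHadoopHomeOrNull() {
            String home = System.getProperty("hadoop.home.dir");
            if (home == null) {
              home = System.getenv("HADOOP_HOME");
            }
            return home;  // may legitimately be null on non-Windows platforms
          }

          public static String getWinUtilsPath() throws IOException {
            String home = getHadoopHomeOrNull();
            if (home == null) {
              if (WINDOWS) {
                throw new IOException("HADOOP_HOME or hadoop.home.dir are not set.");
              }
              return null;  // soft dependency: winutils is not needed off Windows
            }
            return new File(new File(home, "bin"), "winutils.exe").getPath();
          }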

          Bikas Saha added a comment -

          This check was originally added to branch-1-win where a lot of stuff was getting resolved via HADOOP_HOME.


            People

            • Assignee: Arpit Gupta
            • Reporter: Hitesh Shah
            • Votes: 0
            • Watchers: 15
