Hadoop Common
  1. Hadoop Common
  2. HADOOP-2838

Add HADOOP_LIBRARY_PATH config setting so Hadoop will include external directories for jni

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Currently there is no way to configure Hadoop to use external JNI directories. I propose we add a new variable like HADOOP_CLASS_PATH that is added to the JAVA_LIBRARY_PATH before the process is run.

      Now the users can set environment variables using mapred.child.env. They can do the following
      X=Y : set X to Y
      X=$X:Y : Append Y to X (which should be taken from the tasktracker)

      1. HADOOP-2838-v1.0.patch
        8 kB
        Amar Kamat
      2. HADOOP-2838-v1.1.patch
        8 kB
        Amar Kamat
      3. HADOOP-2838-v2.2.patch
        19 kB
        Amar Kamat
      4. HADOOP-2838-v2.2-branch-20-example.patch
        12 kB
        Amar Kamat

        Issue Links

          Activity

          Hide
          Arun C Murthy added a comment - - edited

          I propose we generalise this feature and let users set any environment variable for their tasks... maybe we should add a mapred.child.env configuration variable which takes in a comma-separated list of environment variables:

          <property>
            <name>mapred.child.env</name>
            <value>LD_LIBRARY_PATH=/foo,PATH=/blah</value>
          </property>
          

          Of course we should ensure that each of the env-vars in that list is appended to the env inherited from the parent TT.

          Thoughts?

          Show
          Arun C Murthy added a comment - - edited I propose we generalise this feature and let users set any environment variable for their tasks... maybe we should add a mapred.child.env configuration variable which takes in a comma-separated list of environment variables: <property> <name>mapred.child.env</name> <value>LD_LIBRARY_PATH=/foo,PATH=/blah</value> </property> Of course we should ensure that each of the env-vars in that list is appended to the env inherited from the parent TT. Thoughts?
          Hide
          Christian Kunz added a comment -

          I see a use case for this proposal.

          HADOOP-2867 allows to load shared libraries via symbolic links pointing to localized cache files.
          This would apply to libhdfs library

          On the other hand, libhdfs is part of the Hadoop release and is likely locally available on all the nodes. By being able to set LD_LIBRARY_PATH to some static location, the need to use DistributedCache for dynamically loading libhdfs could be avoided.

          Show
          Christian Kunz added a comment - I see a use case for this proposal. HADOOP-2867 allows to load shared libraries via symbolic links pointing to localized cache files. This would apply to libhdfs library On the other hand, libhdfs is part of the Hadoop release and is likely locally available on all the nodes. By being able to set LD_LIBRARY_PATH to some static location, the need to use DistributedCache for dynamically loading libhdfs could be avoided.
          Hide
          Amar Kamat added a comment -

          +1 for mapred.child.env which will be a comma separated.

          Show
          Amar Kamat added a comment - +1 for mapred.child.env which will be a comma separated.
          Hide
          Amar Kamat added a comment -

          Attaching a patch that provides the facility to add/set/append env variables for child processes. Result of test-patch

          [exec] +1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          

          Todo:

          1. ant test
          2. optimize and re-iterate over the patch.
          Show
          Amar Kamat added a comment - Attaching a patch that provides the facility to add/set/append env variables for child processes. Result of test-patch [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Todo: ant test optimize and re-iterate over the patch.
          Hide
          Amar Kamat added a comment -

          Attaching a new patch with a bug fix. Result of test-patch

          [exec] +1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          

          Ant tests passed on my box. Currently trying to optimize the patch.

          Show
          Amar Kamat added a comment - Attaching a new patch with a bug fix. Result of test-patch [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Ant tests passed on my box. Currently trying to optimize the patch.
          Hide
          Sharad Agarwal added a comment -

          Adding a new MiniDFSCluster based test for this looks to be an overkill. Instead can we write a test by separating this functionality in a method and just testing that? Alternatively we can add to one of the existing MiniDFSCluster test.

          Show
          Sharad Agarwal added a comment - Adding a new MiniDFSCluster based test for this looks to be an overkill. Instead can we write a test by separating this functionality in a method and just testing that? Alternatively we can add to one of the existing MiniDFSCluster test.
          Hide
          Todd Lipcon added a comment -

          Using a comma separated conf variable here seems slightly dangerous. I imagine there must be some cases where people want to pass commas as part of their environment. I don't have a concrete example, but it seems it's only a matter of time until some streaming application comes along and really needs a comma in its environment, and then we're stuck dealing with escaping, etc.

          What about using the entire mapred.child.env namespace, and iterating over it? So mapred.child.env.LD_LIBRARY_PATH would hold the value for that environment variable.

          Unfortunately this would either involve iterating over and filtering all of the conf vars, or switching the "Properties" class used for storage inside Configuration to be a SortedMap so we can do a prefix match. I don't particularly like the former, and the latter is a bigger change, but the interface does seem simpler and also more robust. For example, a sysadmin could set a final property of LD_LIBRARY_PATH="path/to/some.so" to provide some kind of security sandboxing while still allowing the user to modify other variables in the environment.

          Show
          Todd Lipcon added a comment - Using a comma separated conf variable here seems slightly dangerous. I imagine there must be some cases where people want to pass commas as part of their environment. I don't have a concrete example, but it seems it's only a matter of time until some streaming application comes along and really needs a comma in its environment, and then we're stuck dealing with escaping, etc. What about using the entire mapred.child.env namespace, and iterating over it? So mapred.child.env.LD_LIBRARY_PATH would hold the value for that environment variable. Unfortunately this would either involve iterating over and filtering all of the conf vars, or switching the "Properties" class used for storage inside Configuration to be a SortedMap so we can do a prefix match. I don't particularly like the former, and the latter is a bigger change, but the interface does seem simpler and also more robust. For example, a sysadmin could set a final property of LD_LIBRARY_PATH="path/to/some.so" to provide some kind of security sandboxing while still allowing the user to modify other variables in the environment.
          Hide
          Amar Kamat added a comment -

          Todd, I think for now we can keep this simple and get the basic functionality (i.e for paths) working. We can change/add if needed. Thoughts?

          Show
          Amar Kamat added a comment - Todd, I think for now we can keep this simple and get the basic functionality (i.e for paths) working. We can change/add if needed. Thoughts?
          Hide
          Todd Lipcon added a comment -

          Yep, that seems reasonable to me, Amar.

          Show
          Todd Lipcon added a comment - Yep, that seems reasonable to me, Amar.
          Hide
          Amar Kamat added a comment -

          Attaching a patch incorporating Sharad's comments. Result of test-patch

           [exec] +1 overall.  
               [exec] 
               [exec]     +1 @author.  The patch does not contain any @author tags.
               [exec] 
               [exec]     +1 tests included.  The patch appears to include 6 new or modified tests.
               [exec] 
               [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
               [exec] 
               [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
               [exec] 
               [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
               [exec] 
               [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
               [exec] 
               [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
          

          Changes :

          1. Improved the test case
          2. Merged TestMiniMRTaskTempDir with env checks.
            Running ant tests now.
          Show
          Amar Kamat added a comment - Attaching a patch incorporating Sharad's comments. Result of test-patch [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 6 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Changes : Improved the test case Merged TestMiniMRTaskTempDir with env checks. Running ant tests now.
          Hide
          Amar Kamat added a comment -

          Ant tests passed on my box.

          Show
          Amar Kamat added a comment - Ant tests passed on my box.
          Hide
          Sharad Agarwal added a comment -

          +1 patch looks fine to me.

          Show
          Sharad Agarwal added a comment - +1 patch looks fine to me.
          Hide
          Sharad Agarwal added a comment -

          I just committed this. Thanks Amar!

          Show
          Sharad Agarwal added a comment - I just committed this. Thanks Amar!
          Hide
          Amar Kamat added a comment -

          Not to be committed to branch 20. Just an example patch.

          Show
          Amar Kamat added a comment - Not to be committed to branch 20. Just an example patch.
          Hide
          Hemanth Yamijala added a comment -

          Stumbled on this patch.. what happens if the value of mapred.child.env does not have a well formed string - like doesn't have "key=value" but only key ? Has this been taken care of ?

          Show
          Hemanth Yamijala added a comment - Stumbled on this patch.. what happens if the value of mapred.child.env does not have a well formed string - like doesn't have "key=value" but only key ? Has this been taken care of ?
          Hide
          Amar Kamat added a comment -

          Stumbled on this patch.. what happens if the value of mapred.child.env does not have a well formed string - like doesn't have "key=value" but only key ? Has this been taken care of ?

          TaskRunner will fail in child launch leading to task failure ultimately leading to job failure. I see it as something similar to bad mapred code.

          Show
          Amar Kamat added a comment - Stumbled on this patch.. what happens if the value of mapred.child.env does not have a well formed string - like doesn't have "key=value" but only key ? Has this been taken care of ? TaskRunner will fail in child launch leading to task failure ultimately leading to job failure. I see it as something similar to bad mapred code.
          Hide
          Amar Kamat added a comment -

          I tried that with the testcase and the job fails. Wondering if this requires to be added in the testcase.

          Show
          Amar Kamat added a comment - I tried that with the testcase and the job fails. Wondering if this requires to be added in the testcase.
          Hide
          Hemanth Yamijala added a comment -

          Will it fail mysteriously or will it be easy to know that the failure is because of the invalid syntax ? If not, I think it would be useful to make it easy to know this is the cause in some way. Maybe through the diagnostic string.

          Show
          Hemanth Yamijala added a comment - Will it fail mysteriously or will it be easy to know that the failure is because of the invalid syntax ? If not, I think it would be useful to make it easy to know this is the cause in some way. Maybe through the diagnostic string.
          Hide
          Robert Chansler added a comment -

          Editorial pass over all release notes prior to publication of 0.21.

          Show
          Robert Chansler added a comment - Editorial pass over all release notes prior to publication of 0.21.

            People

            • Assignee:
              Amar Kamat
              Reporter:
              Owen O'Malley
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development