Oozie / OOZIE-2066

oozie.launcher.mapreduce.task.classpath.user.precedence is not respected

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Environment:

      yarn

      Description

      When using MR2, the user classpath precedence is not read from the job configuration.

      When submitting a job, the following configuration should result in the Java action running with the user classpath ahead of the Hadoop jars.

      <property>
        <name>oozie.launcher.mapreduce.task.classpath.user.precedence</name>
        <value>true</value>
      </property>
      

      When used in a Java action:

      <action name="run-test">
        <java>
          <job-tracker>c1n2.gbif.org:8032</job-tracker>
          <name-node>hdfs://c1n1.gbif.org:8020</name-node>
          <main-class>test.CPTest</main-class>
        </java>
        <ok to="end" />
        <error to="kill" />
      </action>
      

      However, the setting is not applied, and the Hadoop jars still take precedence.

      A workaround is to set the property directly in the action's configuration in the workflow:

      <action name="run-test">
        <java>
          <job-tracker>c1n2.gbif.org:8032</job-tracker>
          <name-node>hdfs://c1n1.gbif.org:8020</name-node>
          <configuration>
            <property>
              <name>oozie.launcher.mapreduce.task.classpath.user.precedence</name>
              <value>true</value>
            </property>
          </configuration>
          <main-class>test.CPTest</main-class>
        </java>
        <ok to="end" />
        <error to="kill" />
      </action>
      
      1. test-wf.zip
        8 kB
        Tim Robertson

        Activity

        timrobertson100 Tim Robertson added a comment -

        In the attached example, a simple Java action calls a Guava method that requires Guava 17.0+. Hadoop (CDH5.2) brings in Guava 11.0.2, so the action fails because it picks up the older version. The workaround above shows how this can be circumvented.

        Building, installing and running the example requires Maven and a working Hadoop (YARN) environment:

        mvn clean package
        hadoop fs -put target/workflow-hello-world /user/tim/workflow-hello-world
        oozie job -oozie http://c1n2.gbif.org:11000/oozie/ -config example-jobs/hello-world.xml -run
        

        The job will succeed but the logs for the task will show:

        Calling a Guava 17+ method!
        FAIL: com.google.common.io.ByteStreams.newDataOutput(Ljava/io/ByteArrayOutputStream;)Lcom/google/common/io/ByteArrayDataOutput;
        

        With the workaround above applied, the call succeeds.
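
        A check along these lines can confirm whether the Guava 17+ method is visible at runtime. This is only a minimal sketch and not the code from the attached test-wf.zip: the class name GuavaMethodCheck and the helper hasMethod are hypothetical, and reflection is used so that the snippet compiles without Guava on the build path.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not the attached test.CPTest.
public class GuavaMethodCheck {

    /** Returns true if the named class is loadable and has a public method with this signature. */
    public static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class.forName(className).getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            // Either the class is absent or an older version without the method won.
            return false;
        }
    }

    public static void main(String[] args) {
        // ByteStreams.newDataOutput(ByteArrayOutputStream) is the method named in the FAIL line above.
        boolean present = hasMethod("com.google.common.io.ByteStreams",
                                    "newDataOutput", ByteArrayOutputStream.class);
        System.out.println(present
                ? "newDataOutput(ByteArrayOutputStream) is available"
                : "newDataOutput(ByteArrayOutputStream) is NOT available");
    }
}
```

        Run inside the Java action, it should report the method as available only when the newer Guava jar from the workflow lib directory is picked up ahead of Hadoop's copy.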

        rkanter Robert Kanter added a comment -

        I think you're misunderstanding how this works. Any "oozie.launcher.*" and Hadoop (i.e. "mapreduce.*") properties go in the <configuration> section of the action, not the job.properties (or in your case, hello-world.xml). The "workaround" you found is actually the correct way to do this. This is behaving as expected.

        nirmal.hbti Nirmal Kumar added a comment -

        This doesn't seem to be working.

        I tried the same example and am still getting the same error, even with the workaround mentioned:
        >>> Invoking Main class now >>>

        Fetching child yarn jobs
        Could not find Yarn tags property oozie.child.mapreduce.job.tagsMain class : test.CPTest
        Arguments :

        Calling a method that requires Guava 17+
        FAIL: com.google.common.io.ByteStreams.newDataOutput(Ljava/io/ByteArrayOutputStream;)Lcom/google/common/io/ByteArrayDataOutput;

        <<< Invocation of Main class completed <<<

        Oozie Launcher ends

        rkanter Robert Kanter added a comment -

        The action output should also print the classpath; what does it show?
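
        If the classpath in the action output is hard to read, a small probe run from the Java action can show which jar a class was actually loaded from. This is a sketch under stated assumptions: ClasspathProbe and its methods are hypothetical names, and it relies only on standard JDK calls (the java.class.path system property and the class's CodeSource).

```java
import java.io.File;
import java.security.CodeSource;

// Hypothetical diagnostic class, not part of Oozie or the attached example.
public class ClasspathProbe {

    /** Returns the jar or directory a class was loaded from, or a marker if the JVM does not expose it. */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)"; // typical for JDK core classes
        }
        return src.getLocation().toString();
    }

    /** Splits the JVM classpath into its entries, in search order. */
    public static String[] classpathEntries() {
        String cp = System.getProperty("java.class.path", "");
        return cp.isEmpty() ? new String[0] : cp.split(File.pathSeparator);
    }

    public static void main(String[] args) {
        for (String entry : classpathEntries()) {
            System.out.println(entry);
        }
        // Defaults to the Guava class from the example; pass another class name as the first argument.
        String name = args.length > 0 ? args[0] : "com.google.common.io.ByteStreams";
        try {
            System.out.println(name + " loaded from " + locationOf(Class.forName(name)));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

        If the reported location is a Hadoop directory rather than the workflow lib directory, the Hadoop copy of the jar is winning.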

        nirmal.hbti Nirmal Kumar added a comment -

        It is working with the following property though:

        <property>
          <name>oozie.launcher.mapreduce.user.classpath.first</name>
          <value>true</value>
        </property>

        In this case the workflow/lib jars come before the Hadoop native jars.
        Previously, the Hadoop jars preceded the workflow/lib jars.

        Nirmal

        rkanter Robert Kanter added a comment -

        The way this works is that Oozie simply takes any Hadoop/MapReduce property that starts with "oozie.launcher" and applies it to the Launcher Job; it doesn't do anything special for oozie.launcher.mapreduce.task.classpath.user.precedence, oozie.launcher.mapreduce.user.classpath.first, or any others. So if oozie.launcher.mapreduce.task.classpath.user.precedence is not working, you can blame Hadoop for having at least three similarly named properties for this.
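
        The pass-through described here can be pictured roughly as follows. This is an illustrative sketch only and not Oozie's actual code: the class LauncherPropsSketch and the method launcherProps are hypothetical, and stripping the "oozie.launcher." prefix before handing the property to the launcher job is an assumption about what "applies it" means.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of prefix-based pass-through; not taken from the Oozie source.
public class LauncherPropsSketch {

    static final String PREFIX = "oozie.launcher.";

    /** Collects every prefixed property, with the prefix removed (assumed behaviour). */
    public static Map<String, String> launcherProps(Map<String, String> actionConf) {
        Map<String, String> launcherConf = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : actionConf.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                // No property gets special handling: every match is copied the same way.
                launcherConf.put(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return launcherConf;
    }

    public static void main(String[] args) {
        Map<String, String> actionConf = new LinkedHashMap<>();
        actionConf.put("oozie.launcher.mapreduce.task.classpath.user.precedence", "true");
        actionConf.put("oozie.launcher.mapreduce.user.classpath.first", "true");
        actionConf.put("mapreduce.job.queuename", "default"); // no prefix, so not copied
        System.out.println(launcherProps(actionConf));
        // prints {mapreduce.task.classpath.user.precedence=true, mapreduce.user.classpath.first=true}
    }
}
```

        Under that reading, whether a given property has any effect is decided entirely by Hadoop, which is consistent with one property name working and the other not.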


          People

          • Assignee: Unassigned
          • Reporter: timrobertson100 Tim Robertson
          • Votes: 0
          • Watchers: 3
