OOZIE-2606: Set spark.yarn.jars to fix Spark 2.0 with Oozie

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.2.0
    • Fix Version/s: 4.3.0
    • Component/s: core
    • Labels:

      Description

      Oozie adds all of the jars in the Oozie Spark sharelib to the DistributedCache so that every jar is present in the current working directory of the YARN container (and on the container classpath). This is not enough to make Spark 2.0 work, however: by default Spark 2.0 looks for its jars in assembly/target/scala-2.11/jars [1], as if it were a locally built development distribution, and so does not find them in the current working directory.

      To fix this, set spark.yarn.jars to *.jar so that Spark finds the jars in the current working directory instead of looking in the wrong place [2].

      [1] https://github.com/apache/spark/blob/v2.0.0-rc2/launcher/src/main/java/org/apache/spark/launcher/CommandBuilderUtils.java#L357
      [2] https://github.com/apache/spark/blob/v2.0.0-rc2/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L476

      Note: This property will be ignored by Spark 1.x.
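
The proposed fix can be sketched as a small argument-rewriting step. This is only an illustration, not the code from the attached patches: the helper name with_spark_yarn_jars is hypothetical, and the choice to leave a user-supplied spark.yarn.jars value untouched is an assumption. The only details taken from the description above are the property name and the *.jar value; --conf is the standard spark-submit option for passing a property.

```python
from typing import List

SPARK_YARN_JARS = "spark.yarn.jars"
DEFAULT_JARS_GLOB = "*.jar"  # jars localized into the container's working directory


def with_spark_yarn_jars(args: List[str]) -> List[str]:
    """Return a copy of spark-submit style args with spark.yarn.jars set.

    Hypothetical helper (not Oozie's actual implementation). Assumption:
    if the caller already passed --conf spark.yarn.jars=..., that value wins
    and the arguments are returned unchanged.
    """
    # Look at each (flag, value) pair of adjacent arguments.
    for flag, value in zip(args, args[1:]):
        if flag == "--conf" and value.split("=", 1)[0] == SPARK_YARN_JARS:
            return list(args)
    # Otherwise append the default so Spark 2.0 resolves jars from the cwd.
    return list(args) + ["--conf", f"{SPARK_YARN_JARS}={DEFAULT_JARS_GLOB}"]


if __name__ == "__main__":
    print(with_spark_yarn_jars(["--master", "yarn", "--deploy-mode", "cluster"]))
```

Because Spark 1.x ignores the property, a step like this could be applied regardless of which Spark version the sharelib contains.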

        Attachments

        1. OOZIE-2606.patch (5 kB, Jonathan Kelly)
        2. OOZIE-2606-2.patch (9 kB, Satish Subhashrao Saley)
        3. OOZIE-2606-3.patch (10 kB, Satish Subhashrao Saley)
        4. OOZIE-2606-4.patch (12 kB, Satish Subhashrao Saley)

            People

            • Assignee: Satish Subhashrao Saley (satishsaley)
            • Reporter: Jonathan Kelly (jonathak)
            • Votes: 1
            • Watchers: 11
