Oozie / OOZIE-2482

Pyspark job fails with Oozie


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.2.0
    • Fix Version/s: 4.3.0
    • Component/s: core, workflow
    • Labels: None
    • Environment: Hadoop 2.7.2, Spark 1.6.0 on YARN, Oozie 4.2.0; cluster secured with Kerberos

    Description

      Hello,

      I'm trying to run the pi.py example as a PySpark job with Oozie. Every attempt I made failed for the same reason: key not found: SPARK_HOME.
      Note: a Scala job runs fine in the same environment with Oozie.

      The logs on the executors are:

      SLF4J: Class path contains multiple SLF4J bindings.
      SLF4J: Found binding in [jar:file:/mnt/hd4/hadoop/yarn/local/filecache/145/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
      SLF4J: Found binding in [jar:file:/mnt/hd2/hadoop/yarn/local/filecache/155/spark-assembly-1.6.0-hadoop2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
      SLF4J: Found binding in [jar:file:/opt/application/Hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
      SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
      SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
      log4j:ERROR setFile(null,true) call failed.
      java.io.FileNotFoundException: /mnt/hd7/hadoop/yarn/log/application_1454673025841_13136/container_1454673025841_13136_01_000001 (Is a directory)
              at java.io.FileOutputStream.open(Native Method)
              at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
              at java.io.FileOutputStream.<init>(FileOutputStream.java:142)
              at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
              at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
              at org.apache.hadoop.yarn.ContainerLogAppender.activateOptions(ContainerLogAppender.java:55)
              at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
              at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
              at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
              at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809)
              at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735)
              at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615)
              at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502)
              at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547)
              at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483)
              at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
              at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:64)
              at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:285)
              at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:155)
              at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:132)
              at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:275)
              at org.apache.hadoop.service.AbstractService.<clinit>(AbstractService.java:43)
      Using properties file: null
      Parsed arguments:
        master                  yarn-master
        deployMode              cluster
        executorMemory          null
        executorCores           null
        totalExecutorCores      null
        propertiesFile          null
        driverMemory            null
        driverCores             null
        driverExtraClassPath    null
        driverExtraLibraryPath  null
        driverExtraJavaOptions  null
        supervise               false
        queue                   null
        numExecutors            null
        files                   null
        pyFiles                 null
        archives                null
        mainClass               null
        primaryResource         hdfs://hadoopsandbox/User/toto/WORK/Oozie/pyspark/lib/pi.py
        name                    Pysparkpi example
        childArgs               [100]
        jars                    null
        packages                null
        packagesExclusions      null
        repositories            null
        verbose                 true
      
      Spark properties used, including those specified through
       --conf and those from the properties file null:
        spark.executorEnv.SPARK_HOME -> /opt/application/Spark/current
        spark.executorEnv.PYTHONPATH -> /opt/application/Spark/current/python
        spark.yarn.appMasterEnv.SPARK_HOME -> /opt/application/Spark/current
      
      
      Main class:
      org.apache.spark.deploy.yarn.Client
      Arguments:
      --name
      Pysparkpi example
      --primary-py-file
      hdfs://hadoopsandbox/User/toto/WORK/Oozie/pyspark/lib/pi.py
      --class
      org.apache.spark.deploy.PythonRunner
      --arg
      100
      System properties:
      spark.executorEnv.SPARK_HOME -> /opt/application/Spark/current
      spark.executorEnv.PYTHONPATH -> /opt/application/Spark/current/python
      SPARK_SUBMIT -> true
      spark.app.name -> Pysparkpi example
      spark.submit.deployMode -> cluster
      spark.yarn.appMasterEnv.SPARK_HOME -> /opt/application/Spark/current
      spark.yarn.isPython -> true
      spark.master -> yarn-cluster
      Classpath elements:
      
      
      
      Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, key not found: SPARK_HOME
      java.util.NoSuchElementException: key not found: SPARK_HOME
              at scala.collection.MapLike$class.default(MapLike.scala:228)
              at scala.collection.AbstractMap.default(Map.scala:58)
              at scala.collection.MapLike$class.apply(MapLike.scala:141)
              at scala.collection.AbstractMap.apply(Map.scala:58)
              at org.apache.spark.deploy.yarn.Client$$anonfun$findPySparkArchives$2.apply(Client.scala:1045)
              at org.apache.spark.deploy.yarn.Client$$anonfun$findPySparkArchives$2.apply(Client.scala:1044)
              at scala.Option.getOrElse(Option.scala:120)
              at org.apache.spark.deploy.yarn.Client.findPySparkArchives(Client.scala:1044)
              at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:717)
              at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
              at org.apache.spark.deploy.yarn.Client.run(Client.scala:1016)
              at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1076)
              at org.apache.spark.deploy.yarn.Client.main(Client.scala)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
              at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
              at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
              at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
              at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
              at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:104)
              at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:95)
              at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:47)
              at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:38)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:606)
              at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:236)
              at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
              at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
              at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
              at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:380)
              at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:301)
              at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:187)
              at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:230)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
              at java.util.concurrent.FutureTask.run(FutureTask.java:262)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
              at java.lang.Thread.run(Thread.java:745)
      log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
      log4j:WARN Please initialize the log4j system properly.
      log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
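
      The NoSuchElementException is raised inside Spark's YARN client, not by Oozie itself: when the primary resource is a .py file, Client.findPySparkArchives looks for pyspark.zip and py4j-0.9-src.zip under $SPARK_HOME/python/lib, and the strict map access sys.env("SPARK_HOME") throws "key not found" when the variable is absent. The spark.yarn.appMasterEnv.* and spark.executorEnv.* settings above don't help, because they only populate the AM and executor containers, not the environment of the Oozie launcher JVM that invokes spark-submit. For reference, a sketch of the failing code path, paraphrased from Spark 1.6's yarn/Client.scala (exact wording may differ between releases):

      // Paraphrased from org.apache.spark.deploy.yarn.Client (Spark 1.6);
      // assumes import java.io.File. Illustrative only.
      private def findPySparkArchives(): Seq[String] = {
        sys.env.get("PYSPARK_ARCHIVES_PATH")
          .map(_.split(",").toSeq)
          .getOrElse {
            // sys.env(...) uses Map.apply, so a missing SPARK_HOME throws
            // java.util.NoSuchElementException: key not found: SPARK_HOME,
            // exactly the failure in the log above.
            val pyLibPath = Seq(sys.env("SPARK_HOME"), "python", "lib").mkString(File.separator)
            val pyArchivesFile = new File(pyLibPath, "pyspark.zip")
            require(pyArchivesFile.exists(),
              "pyspark.zip not found; cannot run pyspark application in YARN mode.")
            val py4jFile = new File(pyLibPath, "py4j-0.9-src.zip")
            require(py4jFile.exists(),
              "py4j-0.9-src.zip not found; cannot run pyspark application in YARN mode.")
            Seq(pyArchivesFile.getAbsolutePath(), py4jFile.getAbsolutePath())
          }
      }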
      

      The workflow submitted to Oozie is the following:

      <workflow-app xmlns='uri:oozie:workflow:0.5' name='PysparkPi-test'>
              <start to='spark-node' />
              <action name='spark-node'>
                      <spark xmlns="uri:oozie:spark-action:0.1">
                              <job-tracker>${jobTracker}</job-tracker>
                              <name-node>${nameNode}</name-node>
                              <master>${master}</master>
                              <mode>${mode}</mode>
                              <name>Pysparkpi example</name>
                              <class></class>
                              <jar>${nameNode}/User/toto/WORK/Oozie/pyspark/lib/pi.py</jar>
                              <spark-opts>--conf spark.yarn.appMasterEnv.SPARK_HOME=/opt/application/Spark/current --conf spark.executorEnv.SPARK_HOME=/opt/application/Spark/current --conf spark.executorEnv.PYTHONPATH=/opt/application/Spark/current/python</spark-opts>
                              <arg>100</arg>
                      </spark>
                      <ok to="end" />
                      <error to="fail" />
              </action>
              <kill name="fail">
                      <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
              </kill>
              <end name='end' />
      </workflow-app>
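
      Until Oozie ships the PySpark archives itself (the attached pyspark.zip and py4j-0.9-src.zip suggest that is the direction of the fix), a possible workaround is to export SPARK_HOME into the launcher container as well, since --conf values only reach the AM and executors. Below is a minimal sketch, assuming the launcher job honors oozie.launcher.mapred.child.env and that Spark is installed at the same path on every NodeManager; in the spark-action 0.1 schema the <configuration> element goes between <name-node> and <master>:

      <action name='spark-node'>
              <spark xmlns="uri:oozie:spark-action:0.1">
                      <job-tracker>${jobTracker}</job-tracker>
                      <name-node>${nameNode}</name-node>
                      <configuration>
                              <!-- Hypothetical workaround: set SPARK_HOME in the
                                   launcher container that runs spark-submit, not
                                   only in the AM/executor containers. -->
                              <property>
                                      <name>oozie.launcher.mapred.child.env</name>
                                      <value>SPARK_HOME=/opt/application/Spark/current</value>
                              </property>
                      </configuration>
                      <master>${master}</master>
                      <!-- mode, name, class, jar, spark-opts and arg elements
                           unchanged from the workflow above -->
              </spark>
              <ok to="end" />
              <error to="fail" />
      </action>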
      

      I also created a JIRA for Spark: https://issues.apache.org/jira/browse/SPARK-13679

      Attachments

        1. OOZIE-2482-1.patch
          29 kB
          Satish Saley
        2. OOZIE-2482-2.patch
          524 kB
          Satish Saley
        3. OOZIE-2482-3.patch
          535 kB
          Satish Saley
        4. OOZIE-2482-4.patch
          524 kB
          Gézapeti
        5. OOZIE-2482-5.patch
          522 kB
          Gézapeti
        6. OOZIE-2482-6.patch
          523 kB
          Gézapeti
        7. OOZIE-2482-zip.patch
          36 kB
          Gézapeti
        8. py4j-0.9-src.zip
          44 kB
          Satish Saley
        9. pyspark.zip
          349 kB
          Satish Saley


            People

              Assignee: satishsaley (Satish Saley)
              Reporter: BigDataOrange (Alexandre Linte)
              Votes: 1
              Watchers: 13
