
OOZIE-3159: Spark Action fails because of absence of hadoop mapreduce jar(s)

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Fix Version/s: 5.0.0b1

    Description

      OOZIE-2869 removed the MapReduce dependencies from being added to the Spark action. The Spark action uses
      org.apache.hadoop.filecache.DistributedCache, which is no longer available on the Spark action's classpath, causing it to fail.

      java.lang.reflect.InvocationTargetException
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:412)
      	at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:56)
      	at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:225)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
      	at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:219)
      	at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:155)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
      	at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:142)
      Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/filecache/DistributedCache
      	at org.apache.oozie.action.hadoop.SparkArgsExtractor.extract(SparkArgsExtractor.java:309)
      	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:74)
      	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:101)
      	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
      	... 16 more
      Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.filecache.DistributedCache
      	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
      	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
      	... 20 more
      Failing Oozie Launcher, org/apache/hadoop/filecache/DistributedCache
      java.lang.NoClassDefFoundError: org/apache/hadoop/filecache/DistributedCache
      	at org.apache.oozie.action.hadoop.SparkArgsExtractor.extract(SparkArgsExtractor.java:309)
      	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:74)
      	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:101)
      	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:412)
      	at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:56)
      	at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:225)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
      	at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:219)
      	at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:155)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
      	at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:142)
      Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.filecache.DistributedCache
      	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
      	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
      	... 20 more
      Oozie Launcher, uploading action data to HDFS sequence file: hdfs://localhost:8020/user/saley/oozie-sale/0000009-180112124633268-oozie-sale-W/spark-node--spark/action-data.seq
      

      I enabled adding the MapReduce jars by setting oozie.launcher.oozie.action.mapreduce.needed.for to true. The launcher job was then able to kick off the child job, but the child job failed with

      2018-01-12 15:00:13,301 [Driver] ERROR org.apache.spark.deploy.yarn.ApplicationMaster  - User class threw exception: java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s signer information does not match signer information of other classes in the same package
      java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s signer information does not match signer information of other classes in the same package
      	at java.lang.ClassLoader.checkCerts(ClassLoader.java:898)
      	at java.lang.ClassLoader.preDefineClass(ClassLoader.java:668)
      	at java.lang.ClassLoader.defineClass(ClassLoader.java:761)
      	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
      	at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
      	at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
      	at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
      	at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
      	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
      	at org.spark-project.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:136)
      	at org.spark-project.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:129)
      	at org.spark-project.jetty.servlet.ServletContextHandler.<init>(ServletContextHandler.java:98)
      	at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:126)
      	at org.apache.spark.ui.JettyUtils$.createServletHandler(JettyUtils.scala:113)
      	at org.apache.spark.ui.WebUI.attachPage(WebUI.scala:78)
      	at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:62)
      	at org.apache.spark.ui.WebUI$$anonfun$attachTab$1.apply(WebUI.scala:62)
      	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
      	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
      	at org.apache.spark.ui.WebUI.attachTab(WebUI.scala:62)
      	at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:63)
      	at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:76)
      	at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:195)
      	at org.apache.spark.ui.SparkUI$.createLiveUI(SparkUI.scala:146)
      	at org.apache.spark.SparkContext.<init>(SparkContext.scala:473)
      	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
      	at org.apache.oozie.example.SparkFileCopy.main(SparkFileCopy.java:35)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
      2018-01-12 15:00:13,303 [Driver] INFO  org.apache.spark.deploy.yarn.ApplicationMaster  - Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.SecurityException: class "javax.servlet.FilterRegistration"'s signer information does not match signer information of other classes in the same package)
      

      I looked into this exception; it is caused by servlet-api-2.5.jar, which gets pulled in by hadoop-common in the Spark sharelib. We need to revisit the reason for adding hadoop-common as a dependency.
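
      For reference, the launcher-level workaround mentioned above can be expressed as a property in the Spark action's <configuration> block. This is only a minimal illustrative sketch using the property name from this description, not part of the eventual fix:

        <configuration>
            <property>
                <name>oozie.launcher.oozie.action.mapreduce.needed.for</name>
                <value>true</value>
            </property>
        </configuration>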

      Attachments

        1. OOZIE-3159-004.patch
          2 kB
          Attila Sasvári
        2. OOZIE-3159-003.patch
          2 kB
          Attila Sasvári
        3. OOZIE-3159-002.patch
          1 kB
          Attila Sasvári
        4. OOZIE-3159-001.patch
          0.9 kB
          Attila Sasvári


          Activity

            andras.piros Andras Piros added a comment - - edited

             satishsaley a couple of questions that would make reproduction easier:

            • did you set oozie.action.mapreduce.needed.for.spark to true?
            • what other classpath collisions did you encounter while trying to submit a Spark action?
            • can you please attach workflow.xml of the Spark action? Submission mode standalone / yarn client / yarn cluster would be of extreme interest
             • can you please tell the exact Spark version with which you want to submit the workflow?

             asasvari Attila Sasvári added a comment -

             satishsaley thanks for spotting this!

            Spark relies on a lot of hadoop classes that are in hadoop-common (for example Configuration). Without those dependencies, it simply cannot work with Yarn / HDFS. For example, look at Spark documentation and source code:
            Documentation
            HadoopFileLinesReader

            I have seen similar issues earlier.


             asasvari Attila Sasvári added a comment -

             I tested with pseudo-distributed Hadoop 2.6.0 that, after setting oozie.action.mapreduce.needed.for.spark to true in oozie-site.xml, submitting the Spark example workflow

             • with master set to local[*] succeeds
             • with master set to yarn fails in the executor job:
              Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/Logging
                      at java.lang.ClassLoader.defineClass1(Native Method)
                      at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
                      at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
                      at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
                      at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
                      at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
                      at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
                      at java.security.AccessController.doPrivileged(Native Method)
                      at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
                      at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
                      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
                      at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
                      at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:674)
                      at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
              Caused by: java.lang.ClassNotFoundException: org.apache.spark.Logging
                      at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
                      at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
                      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
                      at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
                      ... 14 more
              

             Looking at the launch_container.sh of the executor:

             export CLASSPATH="$PWD:$PWD/__spark_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/share/hadoop/common/*:$HADOOP_COMMON_HOME/share/hadoop/common/lib/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*:$HADOOP_YARN_HOME/share/hadoop/yarn/*:$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*"
            

            files in the container directory of the executor:

            __spark__.jar -> /tmp/hadoop-asasvari/nm-local-dir/usercache/asasvari/filecache/11/spark-yarn_2.10-1.6.1.jar
            __spark_conf__ -> /tmp/hadoop-asasvari/nm-local-dir/usercache/asasvari/filecache/10/__spark_conf__2056190651119303721.zip
            

             As you can see, there are not too many files, and spark-yarn_2.10-1.6.1.jar does not contain org.apache.spark.Logging (it is included in spark-core_2.10-1.6.1.jar, but that jar is not localized in the executor container's dir).
             In __spark_conf__ I can see __spark_conf__.properties where a lot of properties are set, but some of them might not take any effect (runtime properties?). gezapeti can this be related to OOZIE-3112? I was unable to add extra dependencies or set the classpath (tried "spark.executor.extraClassPath" and also "spark.executor.extraLibraryPath").

            satishsaley Satish Saley added a comment -

             Satish Subhashrao Saley a couple of questions that would make reproduction easier:
            did you set oozie.action.mapreduce.needed.for.spark to true?

            Yes. I tried setting oozie.action.mapreduce.needed.for.spark, but we read this property from launcherConf
            https://github.com/apache/oozie/blob/branch-5.0.0-beta1/core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java#L1578

                    final boolean configuredValue = ConfigurationService.getBooleanOrDefault(launcherConf, configurationKey, defaultValue);
            

             This is not available in launcherConf. To make it available in launcherConf, it has to be prefixed with oozie.launcher.
             After I set oozie.launcher.oozie.action.mapreduce.needed.for.spark to true (or oozie.launcher.oozie.action.mapreduce.needed.for to true, as I mentioned in the jira description), it put the mapreduce jars on the classpath.

            We need to fix this as well.

            what other classpath collisions did you encounter while trying to submit a Spark action?

             After I did the above, there was a collision due to servlet-api-2.5.jar.

            can you please attach workflow.xml of the Spark action? Submission mode standalone /
            yarn client / yarn cluster would be of extreme interest

             I used the out-of-the-box example with yarn-cluster mode. https://github.com/apache/oozie/tree/master/examples/src/main/apps/spark

             can you please tell the exact Spark version with which you want to submit the workflow?

            Spark version is spark-1.6.1-bin-hadoop2.6

            Spark relies on a lot of hadoop classes that are in hadoop-common (for example Configuration). Without those dependencies, it simply cannot work with Yarn / HDFS. For example, look at Spark documentation and source code:
            Documentation
            HadoopFileLinesReader

            True. But hadoop-common is already available via $HADOOP_COMMON_HOME/share/hadoop/common/*
             The following is from launch_container.sh (without adding the mapreduce jars):

             
             export CLASSPATH="$PWD:$PWD/*:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/share/hadoop/common/*:$HADOOP_COMMON_HOME/share/hadoop/common/lib/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*:$HADOOP_YARN_HOME/share/hadoop/yarn/*:$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*"
            

             asasvari Attila Sasvári added a comment -

             satishsaley: If you set oozie.action.mapreduce.needed.for.spark to true in oozie-site.xml, the defaultValue will be true and MR jars will be added.
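
             A minimal sketch of that oozie-site.xml entry (property name as above; shown only for illustration):

                 <property>
                     <name>oozie.action.mapreduce.needed.for.spark</name>
                     <value>true</value>
                 </property>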

            Regarding the actual MapReduce dependency in SparkArgsExtractor, DistributedCache is required because of

            SparkMain.fixFsDefaultUrisAndFilterDuplicates(DistributedCache.getCacheFiles(actionConf));
            

             This specific line was introduced by OOZIE-2923 Improve Spark options parsing. However, if I look at the history of the Spark sharelib, DistributedCache was present from the beginning: see OOZIE-1983 Add spark action executor, then OOZIE-2547 introduced fixFsDefaultUris.
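
             For context, a small illustrative sketch of the kind of call involved (class and method names here are made up; this is not the exact Oozie code). On Hadoop 2.x the DistributedCache class ships in hadoop-mapreduce-client-core, which is why the MR jar is needed on the classpath:

             import java.io.IOException;
             import java.net.URI;
             import org.apache.hadoop.conf.Configuration;
             import org.apache.hadoop.filecache.DistributedCache;

             public class CacheFilesSketch {
                 // Reads the cache file URIs localized for the action; this is the API that
                 // SparkArgsExtractor relies on (see the stack trace above) when building Spark arguments.
                 static URI[] cacheFiles(final Configuration actionConf) throws IOException {
                     return DistributedCache.getCacheFiles(actionConf);
                 }
             }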

             It might be better to add only the hadoop-mapreduce-client-core and hadoop-mapreduce-client-common jars instead of the hadoop-common jar, which brings in too many things and causes issues like the one you described (and is added to the classpath by the launcher script anyway). So I uploaded the MR jars, removed hadoop-common, updated the sharelib, and the Spark workflow succeeded.
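
             A sketch of what that swap could look like in the Spark sharelib pom (standard Hadoop 2.x coordinates; versions assumed to be managed in the parent pom — see the attached patches for the actual change):

                 <dependency>
                     <groupId>org.apache.hadoop</groupId>
                     <artifactId>hadoop-mapreduce-client-core</artifactId>
                     <scope>compile</scope>
                 </dependency>
                 <dependency>
                     <groupId>org.apache.hadoop</groupId>
                     <artifactId>hadoop-mapreduce-client-common</artifactId>
                     <scope>compile</scope>
                 </dependency>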

            However, in an earlier release, 4.2.0, Oozie required the spark-assembly jar to be present in the Spark ShareLib (see https://oozie.apache.org/docs/4.2.0/DG_SparkActionExtension.html). That assembly includes a lot of hadoop related classes.

            You are also right that HADOOP_COMMON_HOME is referenced when CLASSPATH is set in the launcher script.


             asasvari Attila Sasvári added a comment -

             satishsaley, andras.piros, gezapeti can you take a look at the attached patch?

            Testing done:

             • mvn clean install assembly:single -DskipTests -Denforcer.skip=true -Dcheckstyle.skip=true -Dfindbugs.skip=true -DtargetJavaVersion=1.8 -DjavaVersion=1.8 -Dhadoop.version=2.6.0 -Puber
             • configured Oozie so that it talks to pseudo-distributed hadoop 2.6.
            • modified examples/apps/spark/workflow.xml so that it also includes <mode>
              <workflow-app xmlns='uri:oozie:workflow:0.5' name='SparkFileCopy'>
                  <start to='spark-node' />
                  <action name='spark-node'>
                      <spark xmlns="uri:oozie:spark-action:0.1">
                          <job-tracker>${jobTracker}</job-tracker>
                          <name-node>${nameNode}</name-node>
                          <prepare>
                              <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/spark"/>
                          </prepare>
                          <master>${master}</master>
                          <mode>${mode}</mode>
                          <name>Spark-FileCopy</name>
                          <class>org.apache.oozie.example.SparkFileCopy</class>
                          <jar>${nameNode}/user/${wf:user()}/${examplesRoot}/apps/spark/lib/oozie-examples.jar</jar>
                          <arg>${nameNode}/user/${wf:user()}/${examplesRoot}/input-data/text/data.txt</arg>
                          <arg>${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/spark</arg>
                      </spark>
                      <ok to="end" />
                      <error to="fail" />
                  </action>
                  <kill name="fail">
                      <message>Workflow failed, error
                          message[${wf:errorMessage(wf:lastErrorNode())}]
                      </message>
                  </kill>
                  <end name='end' />
              </workflow-app>
              
            • CLUSTER mode with YARN: bin/oozie job -oozie http://localhost:11000/oozie -config examples/apps/pyspark/job.properties -run -DnameNode=hdfs://localhost:9000 -DjobTracker=localhost:8032 -Dmaster=yarn -Dmode=cluster, workflow succeeded
            • CLIENT mode with YARN: bin/oozie job -oozie http://localhost:11000/oozie -config examples/apps/pyspark/job.properties -run -DnameNode=hdfs://localhost:9000 -DjobTracker=localhost:8032 -Dmaster=yarn -Dmode=client, workflow succeeded
             • executed all example workflows except the hive related ones; pyspark failed at first because I had not uploaded the required dependencies to the Spark sharelib. After I set it up correctly, the workflow succeeded.

             Note: If you do not specify mode in the Spark workflow when master is set to yarn, the executor will fail.


             asasvari Attila Sasvári added a comment -

             Precommit build finished, but results were not reported here:

            Testing JIRA OOZIE-3159
            
            Cleaning local git workspace
            
            ----------------------------
            
            +1 PATCH_APPLIES
            +1 CLEAN
            -1 RAW_PATCH_ANALYSIS
                +1 the patch does not introduce any @author tags
                +1 the patch does not introduce any tabs
                +1 the patch does not introduce any trailing spaces
                +1 the patch does not introduce any line longer than 132
                -1 the patch does not add/modify any testcase
            +1 RAT
                +1 the patch does not seem to introduce new RAT warnings
            +1 JAVADOC
                +1 the patch does not seem to introduce new Javadoc warnings
            +1 COMPILE
                +1 HEAD compiles
                +1 patch compiles
                +1 the patch does not seem to introduce new javac warnings
            +1 There are no new bugs found in total.
             +1 There are no new bugs found in [docs].
             +1 There are no new bugs found in [sharelib/distcp].
             +1 There are no new bugs found in [sharelib/hive].
             +1 There are no new bugs found in [sharelib/spark].
             +1 There are no new bugs found in [sharelib/hive2].
             +1 There are no new bugs found in [sharelib/hcatalog].
             +1 There are no new bugs found in [sharelib/streaming].
             +1 There are no new bugs found in [sharelib/pig].
             +1 There are no new bugs found in [sharelib/sqoop].
             +1 There are no new bugs found in [sharelib/oozie].
             +1 There are no new bugs found in [examples].
             +1 There are no new bugs found in [client].
             +1 There are no new bugs found in [core].
             +1 There are no new bugs found in [tools].
             +1 There are no new bugs found in [server].
            +1 BACKWARDS_COMPATIBILITY
                +1 the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations
                +1 the patch does not modify JPA files
            +1 TESTS
                Tests run: 2086
                Tests failed at first run:
            TestJavaActionExecutor#testCredentialsSkip
                For the complete list of flaky tests, see TEST-SUMMARY-FULL files.
            +1 DISTRO
                +1 distro tarball builds with the patch
            

             Regarding the failure to add a comment to JIRA - it is in the build log:

            Adding comment to JIRA
            Unable to log in to server: https://issues.apache.org/jira/rpc/soap/jirasoapservice-v2 with user: hadoopqa.
             Cause: (404)404
            
            test-patch exit code: 1
            
            andras.piros Andras Piros added a comment -

            asasvari thanks for the patch!

            Couple of reflections:

            • I think inside core/pom.xml the second hadoop-mapreduce-client-core dependency w/ version element should go to parent pom.xml
            • I'm wondering whether w/ this patch OOZIE-3161 is also resolved (servlet-api-2.5.jar discrepancy)

             asasvari Attila Sasvári added a comment -

             andras.piros thanks for the review. Are you looking at the second patch?
            https://issues.apache.org/jira/secure/attachment/12906130/OOZIE-3159-002.patch ? It changes:

            pom.xml
            sharelib/spark/pom.xml
            

            I am not sure about the second remark. mvn dependency:tree shows mapreduce-client-core jar transitively pulls in servlet-api-2.5.jar via hadoop-yarn-common:

            2254 [INFO] +- org.apache.hadoop:hadoop-mapreduce-client-core:jar:2.6.0:compile
            2255 [INFO] |  +- org.apache.hadoop:hadoop-yarn-common:jar:2.6.0:compile
            2256 [INFO] |  |  +- org.apache.hadoop:hadoop-yarn-api:jar:2.6.0:compile
            ...
            2262 [INFO] |  |  +- javax.servlet:servlet-api:jar:2.5:compile
            

            and also via the oozie-core jar via jetty-server:

            [INFO] |  +- org.eclipse.jetty:jetty-server:jar:9.2.19.v20160908:provided
            [INFO] |  |  +- javax.servlet:javax.servlet-api:jar:3.1.0:provided
            

            These dependencies shall be excluded to avoid conflicts like OOZIE-3161 (where servlet-api is also pulled in via oozie-core). So yes, fixing this should resolve OOZIE-3161 too.

            As a next step: I will exclude servlet-api and upload a new patch.

            Andras: Do you see other dependencies that might collide with Spark at runtime?
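
             To illustrate the kind of exclusion being discussed, a hedged sketch only (which dependency to attach it to was still being worked out at this point; see the attached patches 003/004 for the real change):

                 <dependency>
                     <groupId>org.apache.hadoop</groupId>
                     <artifactId>hadoop-mapreduce-client-core</artifactId>
                     <exclusions>
                         <exclusion>
                             <groupId>javax.servlet</groupId>
                             <artifactId>servlet-api</artifactId>
                         </exclusion>
                     </exclusions>
                 </dependency>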


             asasvari Attila Sasvári added a comment -

             In the attached patch 003, I excluded the javax.servlet servlet-api in the Spark sharelib (exclusion added to oozie-core and spark-streaming-flume).

            Tests done:

            • Uploaded sharelib. Verified servlet api in HDFS:
              hdfs://localhost:9000/user/asasvari/share/lib/lib_20180116150723/spark/javax.servlet-3.0.0.v201112011016.jar
            • executed test workflow -> succeeded
            andras.piros Andras Piros added a comment - - edited

            asasvari thanks for the new patch! It seems to me that right now after a bin/mkdistro there is only one file w/ javax.servlet.FilterRegistration.class inside oozie-sharelib-spark.

            Looks good to me +1


             asasvari Attila Sasvári added a comment -

             Now only javax.servlet-3.0.0.v201112011016.jar is present in Spark sharelib jars:

            find share/lib/spark -name "*.jar" -exec sh -c '(jar -tvf {} | grep FilterRegistration 1>/dev/null && echo {})' \; 
            
            share/lib/spark/javax.servlet-3.0.0.v201112011016.jar

            satishsaley can you take a look? It shall solve OOZIE-3161 too. Spark example workflows succeeded on pseudo Hadoop 2.6.0.

            satishsaley Satish Saley added a comment -

             Exclusion of <groupId>javax.servlet</groupId><artifactId>servlet-api</artifactId> is a no-op for <artifactId>spark-streaming-flume_${spark.scala.binary.version}</artifactId> as it does not fetch servlet-api. Please remove it from there.

             Exclusion of <groupId>javax.servlet</groupId><artifactId>servlet-api</artifactId> is a no-op for <artifactId>oozie-core</artifactId> as it is a provided dependency. Please remove it from there.

            It shall solve OOZIE-3161 too.

             That issue is with jetty:servlet-api-2.5:jar, which is fetched by org.mortbay.jetty:jetty:jar. So there I excluded it from <artifactId>spark-streaming-flume_${spark.scala.binary.version}</artifactId>.
             In the master branch as well as the 5.0.0-beta branch, we have already excluded org.mortbay.jetty:jetty:jar from <artifactId>spark-streaming-flume_${spark.scala.binary.version}</artifactId>.
            So, I will commit OOZIE-3161 only on branch-4.3.
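
             For branch-4.3, the exclusion described above would look roughly like this (a sketch based only on the artifact coordinates in this comment; see OOZIE-3161 for the actual change):

                 <dependency>
                     <groupId>org.apache.spark</groupId>
                     <artifactId>spark-streaming-flume_${spark.scala.binary.version}</artifactId>
                     <exclusions>
                         <exclusion>
                             <groupId>org.mortbay.jetty</groupId>
                             <artifactId>jetty</artifactId>
                         </exclusion>
                     </exclusions>
                 </dependency>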

             gezapeti Gézapeti added a comment -

            I've built a distro with and without the patch and the new sharelib looks better. I'm glad it fixed Spark execution on a pseudodistributed Hadoop

            +1

             


             asasvari Attila Sasvári added a comment -

             gezapeti thanks for the review, committed to master


             asasvari Attila Sasvári added a comment -

             Closing issue; Oozie 5.0.0-beta1 is released


             People

               Assignee: asasvari Attila Sasvári
               Reporter: satishsaley Satish Saley
               Votes: 0
               Watchers: 4
