Uploaded image for project: 'Oozie'
  1. Oozie
  2. OOZIE-3321

PySpark example fails

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 5.0.0
    • Fix Version/s: 5.1.0
    • Component/s: examples
    • Labels:
      None

      Description

      The PySpark example fails with the following exception:

      ACTION[0000001-180806145017687-oozie-oozi-W@spark-node] Launcher exception: Missing py4j and/or pyspark zip files. Please add them to the lib folder or to the Spark sharelib.
      org.apache.oozie.action.hadoop.OozieActionConfiguratorException: Missing py4j and/or pyspark zip files. Please add them to the lib folder or to the Spark sharelib.
      	at org.apache.oozie.action.hadoop.SparkMain.getMatchingPyFile(SparkMain.java:151)
      	at org.apache.oozie.action.hadoop.SparkMain.createPySparkLibFolder(SparkMain.java:132)
      	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:77)
      	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:101)
      	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:410)
      	at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:55)
      	at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:223)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
      	at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:217)
      	at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
      	at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
      

      This is because py4j-0.9-src.zip and pyspark.zip are not available. They are needed for the example to run, so we should either add them to the sharelib or if they are not important enough to be in the sharelib, we should add them to the example's lib directory. Currently these two files can be found in the codebase under sharelib/spark/src/test/resources. We should also decide what to do with these files there, to avoid duplication of resource files.

      If I add the files manually to either the sharelib or the lib directory of the example, the example runs successfilly.

        Attachments

        1. OOZIE-3321.1.patch
          3 kB
          Daniel Becker
        2. OOZIE-3321.2.patch
          8 kB
          Daniel Becker
        3. OOZIE-3321.3.patch
          5 kB
          Daniel Becker

          Activity

            People

            • Assignee:
              daniel.becker Daniel Becker
              Reporter:
              daniel.becker Daniel Becker
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: