Description
From SPARK-1920 and SPARK-1520 we know that PySpark on YARN does not work when the assembly jar is packaged with JDK 1.7+, so ship the pyspark archives to the executors through YARN's --py-files mechanism. The pyspark archive name must contain "spark-pyspark".
1st: zip pyspark to spark-pyspark_2.10.zip
2nd: ./bin/spark-submit --master yarn-client/yarn-cluster --py-files spark-pyspark_2.10.zip app.py args
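A minimal sketch of the two steps above as shell commands. The paths are assumptions (a standard Spark source layout under $SPARK_HOME where the pyspark package lives in python/pyspark); adjust them to your installation, and pick yarn-client or yarn-cluster as appropriate.

```shell
# Step 1: zip the pyspark libraries into an archive whose name
# contains "spark-pyspark" (path under $SPARK_HOME is assumed).
cd "$SPARK_HOME/python"
zip -r spark-pyspark_2.10.zip pyspark

# Step 2: submit the app, shipping the archive to executors
# via --py-files (yarn-client shown; yarn-cluster also works).
"$SPARK_HOME/bin/spark-submit" \
  --master yarn-client \
  --py-files "$SPARK_HOME/python/spark-pyspark_2.10.zip" \
  app.py arg1 arg2
```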
Issue Links
- is duplicated by
  - SPARK-1920 Spark JAR compiled with Java 7 leads to PySpark not working in YARN (Resolved)
- is related to
  - SPARK-1920 Spark JAR compiled with Java 7 leads to PySpark not working in YARN (Resolved)
  - SPARK-8646 PySpark does not run on YARN if master not provided in command line (Closed)
- relates to
  - SPARK-6797 Add support for YARN cluster mode (Resolved)
  - ZEPPELIN-18 Running pyspark without deploying python libraries to every yarn node (Resolved)