Spark / SPARK-33530

Support --archives option natively

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Currently, the spark-submit --archives option and the spark.yarn.dist.archives configuration are supported only in YARN mode:

      spark-submit --help
      ...
       Spark on YARN only:
        --queue QUEUE_NAME          The YARN queue to submit to (Default: "default").
        --archives ARCHIVES         Comma separated list of archives to be extracted into the
                                    working directory of each executor.
      

      This option is critical for PySpark, which relies on it to ship other packages together with an application; see also https://hyukjin-spark.readthedocs.io/en/stable/user_guide/python_packaging.html#using-zipped-virtual-environment.

      Because the option is missing outside YARN, PySpark cannot use a packed conda environment to ship other packages there.
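
The usage this issue is about can be sketched as follows. This is a minimal illustration, not the project's implementation: the helper `build_submit`, the archive name `pyspark_conda_env.tar.gz`, the alias `environment`, and the script `app.py` are all hypothetical. It assumes a conda environment has already been packed into the archive, and it uses `--master yarn` because that is the only mode where `--archives` is supported at the time of this issue.

```python
import shlex


def build_submit(app, archive, alias="environment", master="yarn"):
    """Build the environment and argv for a spark-submit call that ships an archive.

    All names here are illustrative; adjust them to your own setup.
    """
    # Python workers should use the interpreter inside the unpacked archive.
    env = {"PYSPARK_PYTHON": f"./{alias}/bin/python"}
    argv = [
        "spark-submit",
        "--master", master,
        # "<archive>#<alias>": the archive is extracted into a directory
        # named <alias> in the working directory of each executor.
        "--archives", f"{archive}#{alias}",
        app,
    ]
    return env, argv


env, argv = build_submit("app.py", "pyspark_conda_env.tar.gz")
print(env)
print(shlex.join(argv))  # shell-quoted command line, ready to copy
```

The helper only assembles the command; running it requires a Spark installation and would look like `subprocess.run(argv, env={**os.environ, **env})`.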

          People

            Assignee: gurwls223 Hyukjin Kwon
            Reporter: gurwls223 Hyukjin Kwon