SPARK-5929

PySpark: Register a pip requirements file with spark_context


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

Description

I've been doing a lot of work shipping dependencies to workers, since it is non-trivial for me to make sure the workers' own environments include the proper dependencies.

To work around this, I added an addRequirementsFile() method that takes a pip requirements file, downloads the packages, repackages them so they can be registered with addPyFile, and ships them to the workers.

Here is a comparison of what I've done on the Palantir fork:

https://github.com/buckheroux/spark/compare/palantir:master...master
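
Below is a minimal sketch of what such a helper could look like. This is not the code from the fork; the function name add_requirements_file and all of its internals are assumptions based on the description above. It shells out to pip to download the packages into a staging directory, zips them into a single archive, and registers that archive with sc.addPyFile so it is shipped to the workers.

    import os
    import subprocess
    import tempfile
    import zipfile

    def add_requirements_file(sc, requirements_path):
        """Download the packages named in a pip requirements file,
        bundle them into one zip, and ship the zip to workers via
        sc.addPyFile. Hypothetical helper, not the fork's actual code."""
        staging_dir = tempfile.mkdtemp(prefix="spark-reqs-")
        # Materialize the requirements into a local staging directory.
        subprocess.check_call(
            ["pip", "install", "--target", staging_dir, "-r", requirements_path]
        )
        # Repackage the staged packages into a single archive. The archive
        # lives outside staging_dir so it does not end up inside itself.
        archive_path = os.path.join(tempfile.mkdtemp(), "requirements.zip")
        with zipfile.ZipFile(archive_path, "w") as zf:
            for root, _dirs, files in os.walk(staging_dir):
                for name in files:
                    full_path = os.path.join(root, name)
                    zf.write(full_path, os.path.relpath(full_path, staging_dir))
        # addPyFile distributes the archive and adds it to sys.path on workers.
        sc.addPyFile(archive_path)

A driver would call add_requirements_file(sc, "requirements.txt") once at startup, after which workers can import the bundled packages. Note that only zip-importable packages work this way; packages with C extensions generally cannot be imported from a zip archive, which is one practical limit of this approach.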


People

    • Assignee: Unassigned
    • Reporter: buckhx
    • Votes: 0
    • Watchers: 4
