TOREE-310

Allow override of python executable used for pyspark

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels: None

    Description

      We run PySpark inside virtualenvs, and it would be great to be able to use a virtualenv's interpreter as the Python executable used by the Spark driver (i.e. with --master yarn-client). That executable is currently hard-coded to python in org/apache/toree/kernel/interpreter/pyspark/PySparkProcess.scala.

      I have a branch on my repo which adds an optional kernel parameter PYTHON_EXEC:

      ...
          "SPARK_HOME": "/usr/lib/spark",
          "PYTHON_EXEC" : "/usr/local/python/virtualenvs/myvenv/bin/python",
      ...
      

      If PYTHON_EXEC is unspecified, the default of python is used.
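
The fallback described above can be sketched roughly as follows. This is an illustration in Python only, not the code from the linked branch (which changes the Scala file named earlier); the function names `resolve_python_exec` and `build_command` are hypothetical, and treating an empty value like an unset one is an assumption of this sketch.

```python
import os

# The value the issue says is currently hard-coded.
DEFAULT_PYTHON_EXEC = "python"


def resolve_python_exec(env=None):
    """Return the interpreter to launch, falling back to plain `python`.

    `env` is any mapping of environment variables; it defaults to the
    current process environment.
    """
    env = os.environ if env is None else env
    # Assumption of this sketch: an empty PYTHON_EXEC counts as unspecified.
    return env.get("PYTHON_EXEC") or DEFAULT_PYTHON_EXEC


def build_command(script, env=None):
    """Build the argv list for starting the Python-side process (hypothetical helper)."""
    return [resolve_python_exec(env), script]
```

With `PYTHON_EXEC` set to the virtualenv path from the kernel parameters above, `resolve_python_exec` would return that path; with it absent, it would return `python`.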

      Here's the diff of the branch; please let me know if it's OK for me to open a PR against the main repo: https://github.com/ericchang/incubator-toree/compare/ericchang:master...custom-python-exec

          People

            Assignee: Unassigned
            Reporter: Eric Chang (eghchang)
            Votes: 0
            Watchers: 2
