Zeppelin / ZEPPELIN-1883

Can't import packages requested by SPARK_SUBMIT_OPTIONS in pyspark

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: pySpark
    • Labels: None

    Description

      The Zeppelin pyspark interpreter can't import Python modules from packages submitted through SPARK_SUBMIT_OPTIONS. For example:

      # conf/zeppelin-env.sh
      ...
      
      export SPARK_HOME="$HOME/github/apache-spark/1.6.2-bin-hadoop2.6"
      export SPARK_SUBMIT_OPTIONS="--packages com.datastax.spark:spark-cassandra-connector_2.10:1.6.2,TargetHolding:pyspark-cassandra:0.3.5 --exclude-packages org.slf4j:slf4j-api"
      
      ...
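
To make explicit which packages the setting above asks Spark to fetch, here is a minimal sketch that splits a SPARK_SUBMIT_OPTIONS string and lists the coordinates passed to `--packages`. The helper name `requested_packages` is hypothetical and not part of Zeppelin or Spark, and the sketch only handles the space-separated `--packages value` form used in this report.

```python
import shlex


def requested_packages(submit_options):
    """Return the Maven coordinates that follow each --packages flag.

    Hypothetical helper for illustration; it only understands the
    space-separated form `--packages a:b:c,d:e:f`.
    """
    tokens = shlex.split(submit_options)
    packages = []
    for index, token in enumerate(tokens):
        if token == "--packages" and index + 1 < len(tokens):
            # The value is a comma-separated list of group:artifact:version.
            packages.extend(p for p in tokens[index + 1].split(",") if p)
    return packages


options = (
    "--packages com.datastax.spark:spark-cassandra-connector_2.10:1.6.2,"
    "TargetHolding:pyspark-cassandra:0.3.5 "
    "--exclude-packages org.slf4j:slf4j-api"
)
for coordinate in requested_packages(options):
    print(coordinate)
```

Of the two coordinates, `TargetHolding:pyspark-cassandra:0.3.5` is the one expected to provide the `pyspark_cassandra` Python module.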
      

      Then try to import the pyspark_cassandra module in the Zeppelin pyspark interpreter:

      import pyspark_cassandra
      
      
      Traceback (most recent call last):
        File "/var/folders/lr/8g9y625n5j39rz6qhkg8s6640000gn/T/zeppelin_pyspark-5266742863961917074.py", line 267, in <module>
          raise Exception(traceback.format_exc())
      Exception: Traceback (most recent call last):
        File "/var/folders/lr/8g9y625n5j39rz6qhkg8s6640000gn/T/zeppelin_pyspark-5266742863961917074.py", line 265, in <module>
          exec(code)
        File "<stdin>", line 1, in <module>
      ImportError: No module named pyspark_cassandra
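
An `ImportError` like the one above is what Python raises when no entry on `sys.path` contains the module. As a diagnostic sketch (not the fix that was applied in Zeppelin), the helpers below check which downloaded archives contain a given top-level module and append them to `sys.path` so that Python's zip importer can load it. The names `find_archives_with_module` and `make_importable` are hypothetical, and where the jars actually live on disk is an assumption that depends on the Spark configuration.

```python
import sys
import zipfile


def find_archives_with_module(module_name, archive_paths):
    """Return the archives (jar/zip/egg) holding a top-level module or package."""
    wanted = (module_name + ".py", module_name + "/__init__.py")
    hits = []
    for path in archive_paths:
        # is_zipfile returns False for missing or non-zip files.
        if not zipfile.is_zipfile(path):
            continue
        with zipfile.ZipFile(path) as archive:
            names = set(archive.namelist())
        if any(name in names for name in wanted):
            hits.append(path)
    return hits


def make_importable(archive_paths):
    """Append archives to sys.path; return the ones that were newly added."""
    added = []
    for path in archive_paths:
        if path not in sys.path:
            sys.path.append(path)
            added.append(path)
    return added
```

In a pyspark paragraph, one might pass the list of jars resolved by `--packages` (for example a glob over the local Ivy jar directory, if that is where Spark placed them) to `find_archives_with_module("pyspark_cassandra", ...)` and then hand the result to `make_importable` before retrying the import.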
      

            People

              Assignee: 1ambda Hoon Park
              Reporter: 1ambda Hoon Park