Zeppelin / ZEPPELIN-1883

Can't import packages requested by SPARK_SUBMIT_OPTIONS in pyspark

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: pySpark
    • Labels:
      None

      Description

      Zeppelin's pyspark interpreter can't import packages submitted via SPARK_SUBMIT_OPTIONS. For example,

      // conf/zeppelin-env.sh
      ...
      
      export SPARK_HOME="~/github/apache-spark/1.6.2-bin-hadoop2.6"
      export SPARK_SUBMIT_OPTIONS="--packages com.datastax.spark:spark-cassandra-connector_2.10:1.6.2,TargetHolding:pyspark-cassandra:0.3.5 --exclude-packages org.slf4j:slf4j-api"
      
      ...
      

      Then try to import the pyspark-cassandra module in the Zeppelin pyspark interpreter:

      import pyspark_cassandra
      
      
      Traceback (most recent call last):
        File "/var/folders/lr/8g9y625n5j39rz6qhkg8s6640000gn/T/zeppelin_pyspark-5266742863961917074.py", line 267, in <module>
          raise Exception(traceback.format_exc())
      Exception: Traceback (most recent call last):
        File "/var/folders/lr/8g9y625n5j39rz6qhkg8s6640000gn/T/zeppelin_pyspark-5266742863961917074.py", line 265, in <module>
          exec(code)
        File "<stdin>", line 1, in <module>
      ImportError: No module named pyspark_cassandra
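
      For context on why the import fails: the jars pulled in by `--packages` are plain zip archives, so making their pure-Python contents importable is just a matter of putting each jar path on `sys.path` (Python's zipimport handles the rest) — which Zeppelin's pyspark bootstrap was not doing for SPARK_SUBMIT_OPTIONS jars. A minimal, self-contained sketch of the mechanism, using a made-up `demo_pkg` module rather than anything from this issue:

```python
import os
import sys
import tempfile
import zipfile

# A jar is just a zip archive, so zipimport can import pure-Python modules
# shipped inside it once the archive is on sys.path. 'demo_pkg' and the jar
# name below are illustrative only.
jar_path = os.path.join(tempfile.mkdtemp(), "demo-assembly.jar")
with zipfile.ZipFile(jar_path, "w") as jar:
    jar.writestr("demo_pkg/__init__.py", "ANSWER = 42\n")

sys.path.insert(0, jar_path)  # effectively what the eventual fix does per jar
import demo_pkg

print(demo_pkg.ANSWER)  # -> 42
```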
      

        Issue Links

          Activity

          Hide
          githubbot ASF GitHub Bot added a comment -

          GitHub user 1ambda opened a pull request:

          https://github.com/apache/zeppelin/pull/1831

          ZEPPELIN-1883 Can't import spark submitted packages in PySpark

              1. What is this PR for?

          Fixed importing packages in pyspark requested by `SPARK_SUBMIT_OPTIONS`

              2. What type of PR is it?
                [Bug Fix]
              3. Todos

          Nothing

              4. What is the Jira issue?

          ZEPPELIN-1883(https://issues.apache.org/jira/browse/ZEPPELIN-1883)

              5. How should this be tested?

          1. Set `SPARK_HOME` and `SPARK_SUBMIT_OPTIONS` in `conf/zeppelin-env.sh` like

          ```sh
          export SPARK_HOME="~/github/apache-spark/1.6.2-bin-hadoop2.6"
          export SPARK_SUBMIT_OPTIONS="--packages com.datastax.spark:spark-cassandra-connector_2.10:1.6.2,TargetHolding:pyspark-cassandra:0.3.5 --exclude-packages org.slf4j:slf4j-api"
          ```

          2. Test whether submitted packages can be imported or not

          ```
          %pyspark

          import pyspark_cassandra
          ```

              6. Screenshots (if appropriate)

          ```
          import pyspark_cassandra

          Traceback (most recent call last):
          File "/var/folders/lr/8g9y625n5j39rz6qhkg8s6640000gn/T/zeppelin_pyspark-5266742863961917074.py", line 267, in <module>
          raise Exception(traceback.format_exc())
          Exception: Traceback (most recent call last):
          File "/var/folders/lr/8g9y625n5j39rz6qhkg8s6640000gn/T/zeppelin_pyspark-5266742863961917074.py", line 265, in <module>
          exec(code)
          File "<stdin>", line 1, in <module>
          ImportError: No module named pyspark_cassandra
          ```

              7. Questions:
          • Do the license files need to be updated? - NO
          • Are there breaking changes for older versions? - NO
          • Does this need documentation? - NO

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/1ambda/zeppelin ZEPPELIN-1883/cant-import-submitted-packages-in-pyspark

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/zeppelin/pull/1831.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #1831


          commit c735bd54b1ce712641ae9d2c4b780d954c0e985c
          Author: 1ambda <1amb4a@gmail.com>
          Date: 2017-01-02T04:52:40Z

          fix: Import spark submit packages in pyspark


          githubbot ASF GitHub Bot added a comment -

          Github user astroshim commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          In my test, I got an `INFO [2017-01-02 09:08:12,358] ({Exec Default Executor} RemoteInterpreterManagedProcess.java[onProcessComplete]:164) - Interpreter process exited 0` error when I tried to run the paragraph.
          Maybe this error occurs when the libraries from the `SPARK_SUBMIT_OPTIONS` option couldn't be downloaded.
          Is this normal behavior?

          githubbot ASF GitHub Bot added a comment -

          Github user 1ambda commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          @astroshim Thanks for review!

          It's the expected behavior. If spark-submit isn't loaded properly, the spark interpreter will die without errors.

          githubbot ASF GitHub Bot added a comment -

          Github user zjffdu commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          @1ambda Spark doesn't support specifying Python packages through `--packages`; the correct usage is `--py-files`. Although this PR could resolve your issue, the issue here is not due to a Zeppelin bug; it is because of wrong usage of `--packages`.

          githubbot ASF GitHub Bot added a comment -

          Github user felixcheung commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          Right, I'm a bit concerned whether this would be the right fix for the issue.

          githubbot ASF GitHub Bot added a comment -

          Github user 1ambda commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          @zjffdu Thanks for the review.

          Then how can I load [pyspark-cassandra](https://github.com/TargetHolding/pyspark-cassandra#with-spark-packages) for pyspark?

          githubbot ASF GitHub Bot added a comment -

          Github user zjffdu commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          @1ambda Actually, pyspark-cassandra doesn't work for me in the pyspark shell. I guess it works for you because you have installed it locally.
          ```
          >>> import pyspark_cassandra
          Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          ImportError: No module named pyspark_cassandra
          ```

          githubbot ASF GitHub Bot added a comment -

          Github user 1ambda commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          @zjffdu

          I've just created a gist to show that the `--packages` option downloads pyspark-cassandra: https://gist.github.com/1ambda/5caf92753ea2f95ada11b1c13945d261

          ```
          downloading https://repo1.maven.org/maven2/com/datastax/spark/spark-cassandra-connector_2.10/1.6.2/spark-cassandra-connector_2.10-1.6.2.jar ...
          [SUCCESSFUL ] com.datastax.spark#spark-cassandra-connector_2.10;1.6.2!spark-cassandra-connector_2.10.jar (450ms)
          downloading http://dl.bintray.com/spark-packages/maven/TargetHolding/pyspark-cassandra/0.3.5/pyspark-cassandra-0.3.5.jar ...
          [SUCCESSFUL ] TargetHolding#pyspark-cassandra;0.3.5!pyspark-cassandra.jar (310ms)
          downloading https://repo1.maven.org/maven2/com/datastax/spark/spark-cassandra-connector-java_2.10/1.6.0-M1/spark-cassandra-connector-java_2.10-1.6.0-M1.jar ...
          [SUCCESSFUL ] com.datastax.spark#spark-cassandra-connector-java_2.10;1.6.0-M1!spark-cassandra-connector-java_2.10.jar (23ms)
          downloading https://repo1.maven.org/maven2/com/datastax/cassandra/cassandra-driver-core/3.0.0/cassandra-driver-core-3.0.0.jar ...
          [SUCCESSFUL ] com.datastax.cassandra#cassandra-driver-core;3.0.0!cassandra-driver-core.jar(bundle) (78ms)
          :: resolution report :: resolve 2819ms :: artifacts dl 870ms
          ```

          githubbot ASF GitHub Bot added a comment -

          Github user zjffdu commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          Hmm, it works in local mode but doesn't work in yarn-client mode. Could you try yarn-client mode?

          githubbot ASF GitHub Bot added a comment -

          Github user 1ambda commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          I tested on yarn-client and mesos-client, and found that:

          • mesos-client mode copies pyspark-cassandra submitted by `--packages`, as you can see [here](https://gist.github.com/1ambda/e3326107d14ece9a39663cbc56f05756) (the error below is due to an invalid Python version, not a problem of Spark or pyspark-cassandra):

          ```python
          Using Python version 2.6.6 (r266:84292, Aug 18 2016 15:13:37)
          SparkContext available as sc, HiveContext available as sqlContext.
          >>> import pyspark_cassandra
          Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          File "/tmp/spark-df7bc8fa-233f-4124-855b-4a39fa948c1a/userFiles-ab70ffa3-212b-47ee-9611-9c240d3ce899/TargetHolding_pyspark-cassandra-0.3.5.jar/pyspark_cassandra/__init__.py", line 24, in <module>
          File "/tmp/spark-df7bc8fa-233f-4124-855b-4a39fa948c1a/userFiles-ab70ffa3-212b-47ee-9611-9c240d3ce899/TargetHolding_pyspark-cassandra-0.3.5.jar/pyspark_cassandra/context.py", line 16, in <module>
          File "/tmp/spark-df7bc8fa-233f-4124-855b-4a39fa948c1a/userFiles-ab70ffa3-212b-47ee-9611-9c240d3ce899/TargetHolding_pyspark-cassandra-0.3.5.jar/pyspark_cassandra/rdd.py", line 291
          k = Row(**{c: row.__getattr__(c) for c in columns})
                    ^
          SyntaxError: invalid syntax
          >>>
          ```
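
          The `SyntaxError` above comes from dict comprehensions, which require Python 2.7 or later; the banner shows Python 2.6.6, so `pyspark_cassandra`'s `rdd.py` simply cannot be parsed by that interpreter. A sketch of the failing form and a 2.6-compatible rewrite, where the `FakeRow` class and values are made up for illustration:

```python
# Dict comprehensions ({k: v for ...}) were added in Python 2.7; on 2.6 the
# same mapping must be built with dict() over a generator expression.
columns = ["name", "age"]

class FakeRow(object):
    # Hypothetical stand-in for a Cassandra row; not part of pyspark_cassandra.
    def __getattr__(self, name):
        return name.upper()

row = FakeRow()

# Python >= 2.7 / 3.x form (the shape of the line that fails on 2.6):
k_27 = {c: getattr(row, c) for c in columns}

# Python 2.6-compatible equivalent:
k_26 = dict((c, getattr(row, c)) for c in columns)

assert k_27 == k_26 == {"name": "NAME", "age": "AGE"}
```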

          • yarn-client mode doesn't copy pyFiles, as you can see [here](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L354-L357):

          ```scala
          // If we're running a python app, set the main class to our specific python runner
          if (args.isPython && deployMode == CLIENT) {
            ...
            if (clusterManager != YARN) {
              // The YARN backend handles python files differently, so don't merge the lists.
              args.files = mergeFileLists(args.files, args.pyFiles)
            }

          ```

            1. Summary

          @zjffdu @felixcheung

          1. I am not sure why they decided not to copy py-files in yarn-client mode, but it's a problem of Spark, not Zeppelin.
          2. As you saw, this is *expected behavior*, at least in local and mesos-client modes.

          githubbot ASF GitHub Bot added a comment -

          Github user 1ambda commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          Any update on this?

          githubbot ASF GitHub Bot added a comment -

          Github user zjffdu commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          I still think this is not a correct fix, since it doesn't resolve yarn-client mode, which I believe is the mode most users use.

          githubbot ASF GitHub Bot added a comment -

          Github user 1ambda commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          @zjffdu

          > since it doesn't resolve the yarn-client mode

          1. PySpark also doesn't support extending PYTHONPATH in yarn-client.
          2. You keep saying this is not the right fix without suggesting any alternative, so let me ask:

          • How can you load pyspark-cassandra using `--packages`, as described in their README.md, in local and mesos-client modes?
          githubbot ASF GitHub Bot added a comment -

          Github user zjffdu commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          As I said before, why not use `--py-files`? I checked the repository of pyspark-cassandra:
          https://github.com/TargetHolding/pyspark-cassandra

          The README shows that users can use `--py-files`:

          ```
          spark-submit \
            --jars /path/to/pyspark-cassandra-assembly-<version>.jar \
            --driver-class-path /path/to/pyspark-cassandra-assembly-<version>.jar \
            --py-files /path/to/pyspark-cassandra-assembly-<version>.jar \
            --conf spark.cassandra.connection.host=your,cassandra,node,names \
            --master spark://spark-master:7077 \
            yourscript.py
          ```

          githubbot ASF GitHub Bot added a comment -

          Github user 1ambda commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          1. I read and replied before.

          > Q. The README shows that users can use --py-files
          > A. Users cannot benefit from --packages that way. They would need to download all transitive deps, find their locations, and provide the paths to --py-files.

          And even in Spark, we can use `--packages` in local and mesos-client modes. Why do you think Zeppelin shouldn't?

          2. I tested this PR in yarn-client and it works. How did you test this PR in yarn-client?

          > since it doesn't resolve the yarn-client mode

          Could you tell me your env?

          • how did you build (command, env)
          • zeppelin, yarn, spark versions.
          githubbot ASF GitHub Bot added a comment -

          Github user zjffdu commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          Sorry, I missed your last reply. Do you mean yarn-client mode works for you in Spark?
          I used the following command to launch pyspark and got the following error.

          Launch pyspark (I am using Spark 2.1.0):
          ```
          bin/pyspark --packages com.datastax.spark:spark-cassandra-connector_2.10:1.6.2,TargetHolding:pyspark-cassandra:0.3.5 --exclude-packages org.slf4j:slf4j-api --master yarn-client
          ```

          It fails to import pyspark_cassandra:
          ```
          >>> import pyspark_cassandra
          Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
          ImportError: No module named pyspark_cassandra
          ```

          githubbot ASF GitHub Bot added a comment -

          Github user 1ambda commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          @zjffdu I've just fixed it so that, in yarn-client mode only, PYTHONPATH is not extended with the submitted packages.

          githubbot ASF GitHub Bot added a comment -

          Github user zjffdu commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          Thanks, @1ambda. Do you mind creating a Spark ticket as well? The behavior inconsistency between different modes seems to be a Spark issue; we need to clarify it with the Spark community.

          githubbot ASF GitHub Bot added a comment -

          Github user 1ambda commented on the issue:

          https://github.com/apache/zeppelin/pull/1831

          *For reviewers*

          Fixed to use `spark.jars` instead of `classpath`.

          • classpath doesn't include the submitted jars at this moment (it did 7 days ago, but not now)
          • it simplifies the logic, since we don't need to read the old classpath before setting a new one. In other words, we can directly set PYTHONPATH in the `setupPySparkEnv` function.
          • also tested on Spark 1.6.2 and Spark 2.0.0
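
          A rough sketch of what "directly set PYTHONPATH in `setupPySparkEnv`" amounts to: read the comma-separated jar list from the `spark.jars` property and append it to PYTHONPATH. All names below are illustrative and do not match Zeppelin's actual source:

```python
import os

def extend_pythonpath_with_spark_jars(conf_jars, environ):
    # Hypothetical helper: append each jar listed in the spark.jars property
    # to PYTHONPATH so pure-Python modules shipped inside them (e.g.
    # pyspark_cassandra) become importable via zipimport.
    jars = [j for j in conf_jars.split(",") if j.endswith(".jar")]
    parts = [environ["PYTHONPATH"]] if environ.get("PYTHONPATH") else []
    environ["PYTHONPATH"] = os.pathsep.join(parts + jars)
    return environ

env = extend_pythonpath_with_spark_jars(
    "/tmp/spark-cassandra-connector_2.10-1.6.2.jar,/tmp/pyspark-cassandra-0.3.5.jar",
    {"PYTHONPATH": "/opt/spark/python"},
)
print(env["PYTHONPATH"])
```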

          @zjffdu

          The Spark code we talked about came from https://github.com/apache/spark/pull/6360. It seems intentional, so it's OK not to raise an issue.

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/zeppelin/pull/1831


            People

            • Assignee: Hoon Park (1ambda)
            • Reporter: Hoon Park (1ambda)
            • Votes: 0
            • Watchers: 3
