Description
I have some random class I want to access from a Spark shell, say com.cloudera.science.throwaway.ThrowAway. You can find the specific example I used here:
https://gist.github.com/laserson/e9e3bd265e1c7a896652
I packaged it as throwaway.jar.
If I then run bin/spark-shell like so:
bin/spark-shell --master local[1] --jars throwaway.jar
I can execute the following successfully:
val a = new com.cloudera.science.throwaway.ThrowAway()
I now run PySpark like so:
PYSPARK_DRIVER_PYTHON=ipython bin/pyspark --master local[1] --jars throwaway.jar
which gives me an error when I try to instantiate the class through Py4J:
In [1]: sc._jvm.com.cloudera.science.throwaway.ThrowAway()
---------------------------------------------------------------------------
Py4JError                                 Traceback (most recent call last)
<ipython-input-1-4eedbe023c29> in <module>()
----> 1 sc._jvm.com.cloudera.science.throwaway.ThrowAway()

/Users/laserson/repos/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py in __getattr__(self, name)
    724     def __getattr__(self, name):
    725         if name == '__call__':
--> 726             raise Py4JError('Trying to call a package.')
    727         new_fqn = self._fqn + '.' + name
    728         command = REFLECTION_COMMAND_NAME +\

Py4JError: Trying to call a package.
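The "Trying to call a package" message suggests that Py4J could not load the dotted name as a class and fell back to treating it as a package. As a rough diagnostic sketch (the helper name resolves_to_class is hypothetical, and comparing on the type name "JavaPackage" is an assumption made only to avoid importing py4j here), one could check what the name resolves to before calling it:

```python
def resolves_to_class(jvm, fqn):
    """Return True if `fqn` resolves to something other than a Py4J package.

    `jvm` is expected to be a Py4J JVM view such as sc._jvm (assumption).
    """
    obj = jvm
    # Walk the dotted name one attribute at a time, as Py4J itself does.
    for part in fqn.split("."):
        obj = getattr(obj, part)
    # Py4J hands back a JavaPackage placeholder for names it cannot load
    # as a class; calling that placeholder raises the error shown above.
    return type(obj).__name__ != "JavaPackage"
```

In the failing session, resolves_to_class(sc._jvm, "com.cloudera.science.throwaway.ThrowAway") would be expected to return False, which points at the jar missing from the driver's class path rather than at the class itself.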
However, if I also pass the same jar explicitly via --driver-class-path:
PYSPARK_DRIVER_PYTHON=ipython bin/pyspark --master local[1] --jars throwaway.jar --driver-class-path throwaway.jar
it works:
In [1]: sc._jvm.com.cloudera.science.throwaway.ThrowAway()
Out[1]: JavaObject id=o18
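Until --jars alone is enough, the workaround can be wrapped in a small launcher helper. This is only a sketch: the function name pyspark_args is hypothetical, and it assumes --jars takes a comma-separated list while --driver-class-path takes a regular class path joined with the platform's path separator.

```python
import os


def pyspark_args(jars, master="local[1]"):
    """Build a bin/pyspark argument list that passes each jar twice.

    Workaround sketch: the jars go to --jars and are repeated on
    --driver-class-path so the driver JVM can load them as well.
    """
    jars = list(jars)
    return [
        "bin/pyspark",
        "--master", master,
        "--jars", ",".join(jars),                        # comma-separated list
        "--driver-class-path", os.pathsep.join(jars),    # class path syntax
    ]
```

For example, pyspark_args(["throwaway.jar"]) yields the same arguments as the working command above, and the list can be handed to subprocess with PYSPARK_DRIVER_PYTHON set in the environment.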
However, the docs state that --jars should also set the driver class path.
Issue Links
- is duplicated by
  - SPARK-6047 pyspark - class loading on driver failing with --jars and --packages (Resolved)
  - SPARK-5975 SparkSubmit --jars not present on driver in python (Closed)
  - SPARK-6027 Make KafkaUtils work in Python with kafka-assembly provided as --jar or maven package provided as --packages (Closed)
  - SPARK-6301 Unable to load external jars while submitting Spark Job (Closed)
- is part of
  - SPARK-6047 pyspark - class loading on driver failing with --jars and --packages (Resolved)