Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Invalid
-
1.6.1
-
None
-
None
Description
hi, I have a question that pyspark couldn't execute suceess by call jvm's function defined by myself, please view the code below:
from pyspark import SparkConf,SparkContext
from py4j.java_gateway import java_import
if _name_ == "_main_":
- conf = SparkConf().setAppName("testing")
- sc = SparkContext(conf=conf)
sc = SparkContext(appName="Py4jTesting")
def foo:
java_import(sc._jvm, "Calculate")
func = sc._jvm.Calculate()
func.sqAdd
rdd = sc.parallelize([1, 2, 3])
result = rdd.map(foo).collect()
print("$$$$$$$$$$$$$$$$$$$$$$")
print(result)
the result shows as below ,who can help me?
Traceback (most recent call last):
File "/home/manager/data/software/mytest/kehao/driver.py", line 19, in <module>
result = rdd.map(foo).collect()
File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 771, in collect
File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2379, in _jrdd
File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2299, in _prepare_for_python_RDD
File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/serializers.py", line 428, in dumps
File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 646, in dumps
File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 107, in dump
File "/usr/lib/python3.4/pickle.py", line 412, in dump
self.save(obj)
File "/usr/lib/python3.4/pickle.py", line 479, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python3.4/pickle.py", line 744, in save_tuple
save(element)
File "/usr/lib/python3.4/pickle.py", line 479, in save
f(self, obj) # Call unbound method with explicit self
File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 199, in save_function
File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 236, in save_function_tuple
File "/usr/lib/python3.4/pickle.py", line 479, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python3.4/pickle.py", line 729, in save_tuple
save(element)
File "/usr/lib/python3.4/pickle.py", line 479, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python3.4/pickle.py", line 774, in save_list
self._batch_appends(obj)
File "/usr/lib/python3.4/pickle.py", line 801, in _batch_appends
save(tmp[0])
File "/usr/lib/python3.4/pickle.py", line 479, in save
f(self, obj) # Call unbound method with explicit self
File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 193, in save_function
File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 241, in save_function_tuple
File "/usr/lib/python3.4/pickle.py", line 479, in save
f(self, obj) # Call unbound method with explicit self
File "/usr/lib/python3.4/pickle.py", line 814, in save_dict
self._batch_setitems(obj.items())
File "/usr/lib/python3.4/pickle.py", line 840, in _batch_setitems
save(v)
File "/usr/lib/python3.4/pickle.py", line 499, in save
rv = reduce(self.proto)
File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/context.py", line 268, in _getnewargs_
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063