
SPARK-19627: PySpark fails to call a user-defined JVM function


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 1.6.1
    • Fix Version/s: None
    • Component/s: Deploy
    • Labels: None

    Description

      Hi, I have a problem: PySpark fails when calling a JVM function that I defined myself. Please see the code below:

      from pyspark import SparkConf, SparkContext
      from py4j.java_gateway import java_import

      if __name__ == "__main__":
          # conf = SparkConf().setAppName("testing")
          # sc = SparkContext(conf=conf)
          sc = SparkContext(appName="Py4jTesting")

          def foo(x):
              java_import(sc._jvm, "Calculate")
              func = sc._jvm.Calculate()
              return func.sqAdd(x)

          rdd = sc.parallelize([1, 2, 3])

          result = rdd.map(foo).collect()
          print("$$$$$$$$$$$$$$$$$$$$$$")
          print(result)

      The run fails with the traceback below. Can anyone help?

      Traceback (most recent call last):
      File "/home/manager/data/software/mytest/kehao/driver.py", line 19, in <module>
      result = rdd.map(foo).collect()
      File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 771, in collect
      File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2379, in _jrdd
      File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/rdd.py", line 2299, in _prepare_for_python_RDD
      File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/serializers.py", line 428, in dumps
      File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 646, in dumps
      File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 107, in dump
      File "/usr/lib/python3.4/pickle.py", line 412, in dump
      self.save(obj)
      File "/usr/lib/python3.4/pickle.py", line 479, in save
      f(self, obj) # Call unbound method with explicit self
      File "/usr/lib/python3.4/pickle.py", line 744, in save_tuple
      save(element)
      File "/usr/lib/python3.4/pickle.py", line 479, in save
      f(self, obj) # Call unbound method with explicit self
      File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 199, in save_function
      File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 236, in save_function_tuple
      File "/usr/lib/python3.4/pickle.py", line 479, in save
      f(self, obj) # Call unbound method with explicit self
      File "/usr/lib/python3.4/pickle.py", line 729, in save_tuple
      save(element)
      File "/usr/lib/python3.4/pickle.py", line 479, in save
      f(self, obj) # Call unbound method with explicit self
      File "/usr/lib/python3.4/pickle.py", line 774, in save_list
      self._batch_appends(obj)
      File "/usr/lib/python3.4/pickle.py", line 801, in _batch_appends
      save(tmp[0])
      File "/usr/lib/python3.4/pickle.py", line 479, in save
      f(self, obj) # Call unbound method with explicit self
      File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 193, in save_function
      File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/cloudpickle.py", line 241, in save_function_tuple
      File "/usr/lib/python3.4/pickle.py", line 479, in save
      f(self, obj) # Call unbound method with explicit self
      File "/usr/lib/python3.4/pickle.py", line 814, in save_dict
      self._batch_setitems(obj.items())
      File "/usr/lib/python3.4/pickle.py", line 840, in _batch_setitems
      save(v)
      File "/usr/lib/python3.4/pickle.py", line 499, in save
      rv = reduce(self.proto)
      File "/home/manager/data/software/spark-1.6.1-bin-hadoop2.6/python/lib/pyspark.zip/pyspark/context.py", line 268, in _getnewargs_
      Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063


    Activity
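
      This is expected behavior rather than a Spark bug, which is why the issue was resolved as Invalid. rdd.map(foo) has to pickle foo and ship it to the executors, and foo closes over sc (via sc._jvm). The SparkContext and the Py4J gateway behind sc._jvm exist only in the driver process, so the pickling step fails with the SPARK-5063 error above. The usual workaround is to keep all Py4J calls on the driver and run only plain Python inside transformations. A minimal sketch of that pattern, assuming the JVM class Calculate from the report is on the driver's classpath and that sqAdd takes a single int (the method signature is an assumption, not taken from the report):

      from pyspark import SparkContext
      from py4j.java_gateway import java_import

      sc = SparkContext(appName="Py4jTesting")

      # Py4J proxy objects live in the driver JVM and must stay on the driver.
      java_import(sc._jvm, "Calculate")
      calc = sc._jvm.Calculate()

      # Pure-Python work can still run on the executors as usual...
      values = sc.parallelize([1, 2, 3]).map(lambda x: x + 1).collect()

      # ...and the JVM method is called on the driver, after collect().
      result = [calc.sqAdd(v) for v in values]
      print(result)

      If the per-element work genuinely has to run in the JVM on the executors, it must be implemented on the Scala/Java side (for example, as a method that transforms a whole RDD or DataFrame) and invoked once from the driver; there is no supported way to reach sc._jvm from inside map.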

    People

      Assignee: Unassigned
      Reporter: kehao
      Votes: 0
      Watchers: 1
