  1. Spark
  2. SPARK-8296

Unable to load a DataFrame using Python: throws py4j.protocol.Py4JJavaError

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 1.3.1
    • Fix Version/s: 1.3.1
    • Component/s: PySpark, SQL
    • Labels:
    • Environment: Mac OS

      Description

      Loading a JSON file through SQLContext in the prebuilt spark-1.3.1-bin-hadoop2.4 distribution throws py4j.protocol.Py4JJavaError.

      from pyspark.sql import SQLContext
      from pyspark import SparkContext

      sc = SparkContext()
      sqlContext = SQLContext(sc)

      # Create the DataFrame
      df = sqlContext.jsonFile("changes.json")
      # Show the content of the DataFrame
      df.show()

      Error thrown:

      File "/Users/abhishekchoudhary/Work/python/evolveML/kaggle/avirto/test.py", line 11, in <module>
      df = sqlContext.jsonFile("changes.json")
      File "/Users/abhishekchoudhary/bigdata/cdh5.2.0/spark-1.3.1/python/pyspark/sql/context.py", line 377, in jsonFile
      df = self._ssql_ctx.jsonFile(path, samplingRatio)
      File "/Users/abhishekchoudhary/bigdata/cdh5.2.0/spark-1.3.1/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
      File "/Users/abhishekchoudhary/bigdata/cdh5.2.0/spark-1.3.1/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
      py4j.protocol.Py4JJavaError

      After checking the source code, I found that 'gateway_client' is not valid.
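
      The traceback above does not include the Java-side message, so the cause is not confirmed. One thing that may be worth ruling out before digging into py4j is the input itself: jsonFile expects one self-contained JSON object per line, and a relative path such as "changes.json" only resolves to the local working directory when Spark is using the local filesystem (an assumption here, since the Hadoop configuration is not shown). A minimal, Spark-free sketch of such a check follows; the helper name preflight_json_lines is hypothetical and not part of PySpark.

```python
import json
import os


def preflight_json_lines(path):
    """Return a list of problems found in a line-delimited JSON file.

    Hypothetical helper (not a PySpark API). It only inspects the local
    filesystem, so it says nothing about paths that resolve to HDFS.
    """
    abs_path = os.path.abspath(path)
    if not os.path.isfile(abs_path):
        return ["file not found: " + abs_path]

    problems = []
    with open(abs_path) as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # skip blank lines
            try:
                json.loads(line)  # each non-blank line must parse on its own
            except ValueError as exc:
                problems.append("line %d is not valid JSON: %s" % (number, exc))
    return problems


if __name__ == "__main__":
    # Same relative path as in the report; resolved against the current directory.
    for problem in preflight_json_lines("changes.json"):
        print(problem)
```

      An empty result only means the file exists locally and every non-blank line parses as JSON; it does not rule out a problem on the py4j or JVM side.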

            People

            • Assignee: Unassigned
            • Reporter: buntha ABHISHEK CHOUDHARY