Description
Looks like something simple got missed in the Java layer?
>>> from pyspark.sql import SQLContext
>>> sqlContext = SQLContext(sc)
>>> raw = sc.parallelize(['{"a": 5}'])
>>> events = sqlContext.jsonRDD(raw)
>>> events.printSchema()
root
 |-- a: IntegerType
>>> events.cache()
PythonRDD[45] at RDD at PythonRDD.scala:37
>>> events.unpersist()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/spark/python/pyspark/sql.py", line 440, in unpersist
    self._jschema_rdd.unpersist()
  File "/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py", line 537, in __call__
  File "/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py", line 304, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling o108.unpersist. Trace:
py4j.Py4JException: Method unpersist([]) does not exist
        at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:333)
        at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
        at py4j.Gateway.invoke(Gateway.java:251)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:207)
        at java.lang.Thread.run(Thread.java:745)
>>> events.unpersist
<bound method SchemaRDD.unpersist of PythonRDD[45] at RDD at PythonRDD.scala:37>
Note that the unpersist method exists on the Python object, but every call to it raises the error shown above.
This is on 1.0.2-rc1.
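The error message is consistent with Py4J's exact-arity method reflection: if the Scala SchemaRDD overrides unpersist with a required blocking parameter, there is no zero-argument unpersist() on the Java side for Py4J to resolve, even though the Python wrapper happily exposes one. A minimal pure-Python sketch of that dispatch mismatch (JavaLikeGateway and SchemaRDDLike are hypothetical illustration names, not Spark or Py4J code):

```python
# Sketch of Py4J-style dispatch: a method is only callable if a
# signature with the exact number of arguments exists on the target.
import inspect


class JavaLikeGateway:
    """Resolve methods by exact argument count, roughly as
    py4j.reflection.ReflectionEngine.getMethod does."""

    def __init__(self, target):
        self._target = target

    def call(self, name, *args):
        method = getattr(self._target, name)
        params = inspect.signature(method).parameters
        if len(params) != len(args):
            # Analogous to: py4j.Py4JException: Method unpersist([]) does not exist
            raise RuntimeError("Method %s(%s) does not exist" % (name, list(args)))
        return method(*args)


class SchemaRDDLike:
    # Like a Scala override `unpersist(blocking: Boolean)`: the one
    # required argument means a zero-argument call cannot be resolved.
    def unpersist(self, blocking):
        return "unpersisted(blocking=%s)" % blocking


gateway = JavaLikeGateway(SchemaRDDLike())
print(gateway.call("unpersist", True))  # resolves the one-argument form
try:
    gateway.call("unpersist")           # no zero-argument form to find
except RuntimeError as e:
    print(e)
```

Under that reading, pyspark/sql.py's `self._jschema_rdd.unpersist()` fails for exactly this reason, and passing an explicit blocking argument through to the Java side would avoid the reflection miss.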
Issue Links
- is related to SPARK-3500: coalesce() and repartition() of SchemaRDD is broken (Resolved)