Description
from pyspark.sql import SQLContext
sqlCtx = SQLContext(sc)

# size of cluster impacts where this breaks (i.e. 2**15 vs 2**2)
data = sc.parallelize([{'name': 'index', 'value': 0}] * 2**20)
sdata = sqlCtx.inferSchema(data)
sdata.first()
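Not part of the original report, but a rough Spark-free sketch of why the failure scales with row count: PySpark pickles a whole batch of rows and pushes it to the daemon in a single write, so 2**20 rows produce a multi-megabyte payload (this uses plain pickle only; PySpark's actual framing in pyspark/serializers.py differs):

```python
import pickle

# Illustrative only -- plain pickle, not PySpark's serializer framing.
# Serializing the full 2**20-row batch in one shot yields a payload in
# the megabyte range, which is then pushed through the worker's pipe
# as one write.
rows = [{'name': 'index', 'value': 0}] * 2**20
payload = pickle.dumps(rows, protocol=2)
print(len(payload))  # on the order of megabytes
```

Smaller row counts (e.g. 2**15) shrink the payload accordingly, which is consistent with the observation that where the repro breaks depends on the cluster.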
Result (note: the value is returned as well as the error):
>>> sdata.first()
14/07/18 12:10:25 INFO SparkContext: Starting job: runJob at PythonRDD.scala:290
14/07/18 12:10:25 INFO DAGScheduler: Got job 43 (runJob at PythonRDD.scala:290) with 1 output partitions (allowLocal=true)
14/07/18 12:10:25 INFO DAGScheduler: Final stage: Stage 52(runJob at PythonRDD.scala:290)
14/07/18 12:10:25 INFO DAGScheduler: Parents of final stage: List()
14/07/18 12:10:25 INFO DAGScheduler: Missing parents: List()
14/07/18 12:10:25 INFO DAGScheduler: Computing the requested partition locally
14/07/18 12:10:25 INFO PythonRDD: Times: total = 45, boot = 3, init = 40, finish = 2
14/07/18 12:10:25 INFO SparkContext: Job finished: runJob at PythonRDD.scala:290, took 0.048348426 s
{u'name': u'index', u'value': 0}
>>> PySpark worker failed with exception:
Traceback (most recent call last):
  File "/home/matt/Documents/Repositories/spark/dist/python/pyspark/worker.py", line 77, in main
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/home/matt/Documents/Repositories/spark/dist/python/pyspark/serializers.py", line 191, in dump_stream
    self.serializer.dump_stream(self._batched(iterator), stream)
  File "/home/matt/Documents/Repositories/spark/dist/python/pyspark/serializers.py", line 124, in dump_stream
    self._write_with_length(obj, stream)
  File "/home/matt/Documents/Repositories/spark/dist/python/pyspark/serializers.py", line 139, in _write_with_length
    stream.write(serialized)
IOError: [Errno 32] Broken pipe
Traceback (most recent call last):
  File "/home/matt/Documents/Repositories/spark/dist/python/pyspark/daemon.py", line 130, in launch_worker
    worker(listen_sock)
  File "/home/matt/Documents/Repositories/spark/dist/python/pyspark/daemon.py", line 119, in worker
    outfile.flush()
IOError: [Errno 32] Broken pipe