Description
When running reduceByKey over a cached RDD, the Python worker fails with an exception, but the failure is not detected by the task runner. Spark and the pyspark shell hang waiting for the task to finish.
The error is:
PySpark worker failed with exception:
Traceback (most recent call last):
  File "/home/hadoop/spark/python/pyspark/worker.py", line 77, in main
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/home/hadoop/spark/python/pyspark/serializers.py", line 182, in dump_stream
    self.serializer.dump_stream(self._batched(iterator), stream)
  File "/home/hadoop/spark/python/pyspark/serializers.py", line 118, in dump_stream
    self._write_with_length(obj, stream)
  File "/home/hadoop/spark/python/pyspark/serializers.py", line 130, in _write_with_length
    stream.write(serialized)
IOError: [Errno 104] Connection reset by peer

14/03/19 22:48:15 INFO scheduler.TaskSetManager: Serialized task 4.0:0 as 4257 bytes in 47 ms

Traceback (most recent call last):
  File "/home/hadoop/spark/python/pyspark/daemon.py", line 117, in launch_worker
    worker(listen_sock)
  File "/home/hadoop/spark/python/pyspark/daemon.py", line 107, in worker
    outfile.flush()
IOError: [Errno 32] Broken pipe
I can reproduce the error by running take(10) on the cached RDD before running reduceByKey (which scans the whole input file); see the sketch below.
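For reference, a minimal reproduction sketch along those lines. The input path, key/value layout, and master URL are assumptions for illustration; the original report does not include the exact script:

from pyspark import SparkContext

sc = SparkContext("local", "repro")  # assumed master; original ran on a cluster

# A large text file, mapped to (key, value) pairs and cached.
rdd = sc.textFile("/path/to/large/input.txt") \
        .map(lambda line: (line.split("\t")[0], 1)) \
        .cache()

rdd.take(10)  # materializes only the first partition

# reduceByKey scans the whole input; the worker dies with the exception
# above, but collect() hangs instead of surfacing the failure.
counts = rdd.reduceByKey(lambda a, b: a + b)
result = counts.collect()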
Affects Version: 1.0.0-SNAPSHOT (commit 4d88030486)