Description
>>> rdd = sc.parallelize(range(1<<20)).map(lambda x: str(x))
>>> rdd._jrdd.first()
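Calling first() on the underlying JavaRDD handle (_jrdd) pulls a single element and then abandons the result iterator; as the trace below suggests, this leaves the writer thread that feeds the Python worker blocked with most of the partition still unwritten.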
Here is the stack trace while it hangs:
"Executor task launch worker-5" daemon prio=10 tid=0x00007f8fd01a9800 nid=0x566 in Object.wait() [0x00007f90481d7000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x0000000630929340> (a org.apache.spark.api.python.PythonRDD$WriterThread) at java.lang.Thread.join(Thread.java:1281) - locked <0x0000000630929340> (a org.apache.spark.api.python.PythonRDD$WriterThread) at java.lang.Thread.join(Thread.java:1355) at org.apache.spark.api.python.PythonRDD$$anonfun$compute$1.apply(PythonRDD.scala:78) at org.apache.spark.api.python.PythonRDD$$anonfun$compute$1.apply(PythonRDD.scala:76) at org.apache.spark.TaskContextImpl$$anon$1.onTaskCompletion(TaskContextImpl.scala:49) at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:68) at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:66) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:66) at org.apache.spark.scheduler.Task.run(Task.scala:58) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745)
Issue Links
- is duplicated by: SPARK-6344 "Pyspark local stalls when take() before count() on cached rdd" (Resolved)
- links to