Affects Version/s: 1.1.1, 1.2.0, 1.3.0
Component/s: Spark Core
Java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
Mac OS X 10.10.1
Using local spark context
The attached test class runs two identical jobs that perform some iterative computation on an RDD[(Int, Int)]. Each iteration involves:
- taking new data and merging it with the previous result
- caching and checkpointing the new result
- rinse and repeat
The first time the job is run, it completes successfully and the SparkContext is shut down. The second time the job is run, with a new SparkContext in the same process, it hangs indefinitely, having scheduled only a subset of the tasks needed for the final stage.
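The shape of the repro can be sketched as below. This is a minimal illustration of the pattern described above, not the attached test class itself; the object name, checkpoint directory, iteration count, and merge logic (union + reduceByKey) are all illustrative assumptions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object CheckpointHangSketch {
  // One "job": iteratively merge new data into the previous result,
  // caching and checkpointing each round, then stop the context.
  def runJob(): Unit = {
    val conf = new SparkConf().setMaster("local[4]").setAppName("repro")
    val sc = new SparkContext(conf)
    sc.setCheckpointDir("/tmp/spark-checkpoint-repro") // illustrative path
    try {
      var result: RDD[(Int, Int)] = sc.parallelize(Seq.empty[(Int, Int)])
      for (i <- 1 to 5) {
        // take new data and merge it with the previous result
        val newData = sc.parallelize((1 to 100).map(k => (k, i)))
        result = result.union(newData).reduceByKey(_ + _)
        // cache and checkpoint the new result
        result.cache()
        result.checkpoint()
        result.count() // force evaluation so the checkpoint materializes
      }
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    runJob() // first run: completes successfully
    runJob() // second run, new SparkContext in the same process: hangs
  }
}
```

Running the second `runJob()` in a fresh JVM instead of the same process avoids the hang, which points at state surviving `sc.stop()`.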
I've been able to produce a test case that reproduces the issue, and I've added comments where some knockout experimentation has left breadcrumbs as to where the problem might lie.