Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.1.3, 2.2.3, 2.3.0, 2.3.1, 2.3.2, 2.4.0
None
Description
While investigating an OOM in an internal production job, I found that PipedRDD leaks memory. After some digging, the problem comes down to the fact that PipedRDD doesn't release its stdin writer and stdout reader threads even after the task has finished.
PipedRDD creates two threads: a stdin writer and a stdout reader. If we are lucky and the task finishes normally, both threads exit normally. If the subprocess (the pipe command) fails, the task is marked as failed, but the stdin writer keeps running until it has consumed its parent RDD's iterator. There is also a race condition with ShuffledRDD + PipedRDD: the ShuffleBlockFetcherIterator is cleaned up at task completion, which hangs the stdin writer thread and leaks memory.
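To illustrate the leak pattern described above, here is a minimal sketch in Python rather than Spark's own Scala. It is not the PipedRDD code or the actual fix: the names `stoppable`, `start_stdin_writer`, `task_done`, and `sink` are hypothetical, and `sink` merely stands in for the child process's stdin. It shows a writer thread that would otherwise drain an endless parent iterator, stopped by a flag that a task-completion callback could set.

```python
import itertools
import threading


def stoppable(records, task_done):
    """Yield from `records` until the `task_done` event is set (hypothetical helper)."""
    for record in records:
        if task_done.is_set():
            return
        yield record


def start_stdin_writer(records, sink, task_done):
    """Start a thread that feeds `records` to `sink`, a stand-in for the child's stdin."""
    def run():
        for record in stoppable(records, task_done):
            sink(record)

    writer = threading.Thread(target=run, name="stdin-writer", daemon=True)
    writer.start()
    return writer


# An endless parent iterator: without the stop check, the writer would never exit.
task_done = threading.Event()
written = []
writer = start_stdin_writer(itertools.count(), written.append, task_done)

task_done.set()          # what a task-completion callback would do in this sketch
writer.join(timeout=5)   # the writer sees the flag on its next record and returns
```

A flag check like this only helps while the parent iterator keeps producing records. If the iterator itself blocks, as in the ShuffledRDD race described above, the writer would presumably also need to be interrupted from outside.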