Details
Bug
Status: Resolved
Blocker
Resolution: Fixed
0.9.0, 1.0.0
None
Description
I've been looking at this in the context of a BlockManagerMaster that OOMs and doesn't respond to heartBeat(), but I suspect there may be similar problems elsewhere that we use Akka's scheduler.
The basic problem is that we expect exceptions thrown from a scheduled function to be caught in the thread that called ActorSystem.scheduler.schedule() or scheduleOnce(). In fact, the scheduled function runs on its own thread, so any exceptions it throws are not caught in the thread that called schedule(). For example, unanswered BlockManager heartBeats (scheduled in BlockManager#initialize) end up throwing exceptions in BlockManagerMaster#askDriverWithReply, but those exceptions are not handled by the Executor thread's UncaughtExceptionHandler.
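To illustrate the threading issue without pulling in Akka, here is a minimal sketch that uses the JDK's ScheduledExecutorService as a stand-in for ActorSystem.scheduler. This is an assumption made for the sake of a dependency-free example, and where an uncaught exception finally ends up may differ between the two schedulers. The class name, the heartBeat() helper, and runOnce() are hypothetical and do not come from the Spark code base. The sketch shows that a try/catch around the scheduling call never sees an exception thrown later by the scheduled function, whereas a try/catch inside the scheduled function does.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class ScheduledExceptionDemo {

    // Hypothetical stand-in for a heartbeat that the master never answers.
    static void heartBeat() {
        throw new IllegalStateException("no reply from master");
    }

    // Returns where the exception was observed: "caller", "task", or "nowhere".
    static String runOnce(boolean catchInsideTask) throws InterruptedException {
        AtomicReference<String> seenBy = new AtomicReference<>("nowhere");
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        // The scheduled function; it runs later, on the scheduler's own thread.
        Runnable task = () -> {
            if (catchInsideTask) {
                try {
                    heartBeat();
                } catch (RuntimeException e) {
                    // Handling the failure where it is actually thrown.
                    seenBy.set("task");
                }
            } else {
                heartBeat(); // Throws on the scheduler thread, not the caller's.
            }
        };

        try {
            // schedule() returns immediately; the task has not run yet.
            scheduler.schedule(task, 10, TimeUnit.MILLISECONDS);
        } catch (RuntimeException e) {
            // Never reached for failures inside the task.
            seenBy.set("caller");
        }

        // By default, already-scheduled delayed tasks still run after shutdown().
        scheduler.shutdown();
        scheduler.awaitTermination(1, TimeUnit.SECONDS);
        return seenBy.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOnce(false)); // prints "nowhere"
        System.out.println(runOnce(true));  // prints "task"
    }
}
```

In the first call the exception is silently captured by the JDK scheduler and no code written by the caller observes it, which mirrors the symptom described above. One possible remedy, shown in the second call, is to catch exceptions inside the scheduled function itself and forward them to whatever handler is appropriate.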