[SPARK-1686] Master switches thread when ElectedLeader - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 0.9.0, 1.0.0
Fix Version/s: None
Component/s: Spark Core
Labels: None

Description

In deploy.master.Master, the completeRecovery method is the last step performed when a standalone Master recovers from failure. It is responsible for resetting some state, relaunching drivers, and eventually resuming the Master's scheduling duties.

There are currently four places in Master.scala where completeRecovery is called. Three of the calls are made directly from within the actor's receive method and are not a problem. The fourth originates in receive when the ElectedLeader message arrives, but the actual completeRecovery() call is made by the Akka scheduler. It therefore executes on a scheduler thread rather than on the actor's thread, and the Master's subsequent work (i.e., schedule()) ends up running on that Akka scheduler thread as well. Among other things, this means that uncaught exception handling will be different; see https://issues.apache.org/jira/browse/SPARK-1620
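The thread switch described above can be reproduced without Akka. The following is a minimal sketch, not the code in Master.scala: it assumes a single-threaded executor as a stand-in for the actor and a scheduled executor as a stand-in for the Akka scheduler, and all names in it (ThreadSwitchSketch, actor-thread, scheduler-thread) are hypothetical. It contrasts a callback that the scheduler runs itself with a callback that the scheduler merely hands back to the actor's own thread.

```scala
import java.util.concurrent.{CountDownLatch, Executors, ThreadFactory, TimeUnit}
import java.util.concurrent.atomic.AtomicReference

object ThreadSwitchSketch {
  // Thread factory that gives each pool's single thread a recognizable name.
  private def named(name: String): ThreadFactory = new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, name)
      t.setDaemon(true)
      t
    }
  }

  // Wraps a block of code in a Runnable.
  private def task(body: => Unit): Runnable = new Runnable {
    def run(): Unit = body
  }

  /** Returns (thread that ran the direct callback, thread that ran the handed-back callback). */
  def run(): (String, String) = {
    // Hypothetical stand-ins: one thread plays the actor, one plays the scheduler.
    val actor = Executors.newSingleThreadExecutor(named("actor-thread"))
    val scheduler = Executors.newSingleThreadScheduledExecutor(named("scheduler-thread"))
    val direct = new AtomicReference[String]("")
    val viaMessage = new AtomicReference[String]("")
    val done = new CountDownLatch(2)

    // Handling of the "ElectedLeader" message starts on the actor thread.
    actor.execute(task {
      // Pattern from the report: the scheduler invokes the recovery callback itself.
      scheduler.schedule(task {
        direct.set(Thread.currentThread().getName)
        done.countDown()
      }, 10L, TimeUnit.MILLISECONDS)

      // Alternative: the scheduler only enqueues work back onto the actor thread.
      scheduler.schedule(task {
        actor.execute(task {
          viaMessage.set(Thread.currentThread().getName)
          done.countDown()
        })
      }, 10L, TimeUnit.MILLISECONDS)
    })

    done.await(5, TimeUnit.SECONDS)
    actor.shutdownNow()
    scheduler.shutdownNow()
    (direct.get, viaMessage.get)
  }
}
```

In this sketch the first callback observes the scheduler's thread, while the second observes the actor's thread. In an actor system the analogous approach would be to have the scheduler deliver a message to the actor rather than run the recovery code directly, so that the work stays inside receive.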

People

Assignee: Nan Zhu

Reporter: Mark Hamstra

Votes: 0

Watchers: 2

Dates

Created: 30/Apr/14 21:02

Updated: 10/May/14 04:53

Resolved: 10/May/14 04:53