Spark / SPARK-1686

Master switches thread when ElectedLeader

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0, 1.0.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      In deploy.master.Master, the completeRecovery method is the last thing to be called when a standalone Master is recovering from failure. It is responsible for resetting some state, relaunching drivers, and eventually resuming its scheduling duties.

      There are currently four places in Master.scala where completeRecovery is called. Three of them are called directly from within the actor's receive method and are not a problem. The fourth also originates in receive, when the ElectedLeader message is received, but the actual completeRecovery() call is made by the Akka scheduler. It therefore executes on a scheduler thread rather than on the actor's thread, and Master itself will end up running (i.e., schedule()) on that Akka scheduler thread. Among other things, this means that uncaught exception handling will be different; see https://issues.apache.org/jira/browse/SPARK-1620
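
The thread switch described above can be reproduced without Akka. The following is a minimal sketch, not the Master code: it uses two single-threaded JDK executors as hypothetical stand-ins for the actor's thread ("master-actor") and the Akka scheduler ("scheduler"), and a hypothetical completeRecovery that only records the name of the thread it runs on. The object name ThreadSwitchSketch, the observe method, and the thread names are all assumptions made for illustration.

```scala
import java.util.concurrent.{CountDownLatch, Executors, ScheduledExecutorService, ThreadFactory, TimeUnit}
import java.util.concurrent.atomic.AtomicReference

object ThreadSwitchSketch {
  // Build a single-threaded executor whose only thread has a recognizable name.
  private def singleThread(name: String): ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
      override def newThread(r: Runnable): Thread = {
        val t = new Thread(r, name)
        t.setDaemon(true)
        t
      }
    })

  // Returns the thread names that ran the hypothetical completeRecovery:
  // (closure run by the scheduler, work handed back to the actor thread).
  def observe(): (String, String) = {
    val actor = singleThread("master-actor") // stand-in for the actor's receive loop
    val scheduler = singleThread("scheduler") // stand-in for the Akka scheduler
    val done = new CountDownLatch(2)
    val direct = new AtomicReference[String]("")
    val viaMessage = new AtomicReference[String]("")

    // Hypothetical completeRecovery: only records which thread executes it.
    def completeRecovery(sink: AtomicReference[String]): Unit = {
      sink.set(Thread.currentThread().getName)
      done.countDown()
    }

    // Shape described in the issue: the scheduler runs the closure itself.
    scheduler.schedule(new Runnable {
      override def run(): Unit = completeRecovery(direct)
    }, 10L, TimeUnit.MILLISECONDS)

    // Alternative shape: the scheduler only hands the work back to the actor thread.
    scheduler.schedule(new Runnable {
      override def run(): Unit = actor.execute(new Runnable {
        override def run(): Unit = completeRecovery(viaMessage)
      })
    }, 10L, TimeUnit.MILLISECONDS)

    // Wait for both variants to finish, then stop the executors.
    done.await(5, TimeUnit.SECONDS)
    scheduler.shutdown()
    actor.shutdown()
    (direct.get, viaMessage.get)
  }

  def main(args: Array[String]): Unit = {
    val (closureThread, messageThread) = observe()
    println(s"closure scheduled directly ran on: $closureThread")
    println(s"work handed back to the actor ran on: $messageThread")
  }
}
```

In Akka terms, the second shape would roughly correspond to having the scheduler deliver a message to the actor so that completeRecovery() is invoked from receive like the other three call sites. Whether the actual fix took that form is not stated in this issue.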

          People

            Assignee: Nan Zhu (codingcat)
            Reporter: Mark Hamstra (markhamstra)