Spark / SPARK-34245

Master may not remove the finished executor when Worker fails to send ExecutorStateChanged

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.7, 3.0.1, 3.1.1, 3.2.0
    • Fix Version/s: 3.2.0
    • Component/s: Deploy, Spark Core
    • Labels: None

    Description

      If the Worker fails to send ExecutorStateChanged to the Master because of an error, e.g., a temporary network failure, the Master cannot remove the finished executor and keeps treating it as alive. In the worst case, if that executor is the application's only executor, the application can hang.
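
The failure mode can be illustrated with a small toy model. This is not Spark's code and not necessarily how the issue was fixed: `Master`, `FlakyLink`, `notify_once`, and `notify_with_retry` are hypothetical names made up for this sketch, and the bounded retry is only one possible mitigation, assumed here for illustration.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Master:
    """Toy stand-in for the Master: the executors it believes are alive."""
    alive: Set[str] = field(default_factory=lambda: {"exec-0"})

    def executor_state_changed(self, executor_id: str) -> None:
        # Forgetting the finished executor is what allows a replacement
        # to be scheduled for the application.
        self.alive.discard(executor_id)


class FlakyLink:
    """Hypothetical transport that drops the first `failures` messages."""

    def __init__(self, master: Master, failures: int) -> None:
        self.master = master
        self.failures = failures

    def send(self, executor_id: str) -> None:
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("temporary network error")
        self.master.executor_state_changed(executor_id)


def notify_once(link: FlakyLink, executor_id: str) -> bool:
    """Single attempt: one failure loses the state change for good."""
    try:
        link.send(executor_id)
        return True
    except ConnectionError:
        return False


def notify_with_retry(link: FlakyLink, executor_id: str,
                      max_attempts: int = 5) -> bool:
    """Assumed mitigation: retry a bounded number of times."""
    for _ in range(max_attempts):
        try:
            link.send(executor_id)
            return True
        except ConnectionError:
            continue
    return False


master = Master()

# The only executor finishes, but the single notification is dropped.
lost = notify_once(FlakyLink(master, failures=1), "exec-0")
stale_view = set(master.alive)  # snapshot of the Master's view afterwards

# With retries, the same kind of transient failure is eventually overcome.
delivered = notify_with_retry(FlakyLink(master, failures=2), "exec-0")
```

After the dropped notification, `stale_view` still contains the finished executor, which corresponds to the hang described above. In a real deployment the retry count would presumably need an upper bound and some fallback for the case where every attempt fails.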

       

          People

            Assignee: Ngone51 wuyi
            Reporter: Ngone51 wuyi
            Votes: 0
            Watchers: 3
