Spark / SPARK-12552

Recovered driver's resource is not counted in the Master

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 2.2.0, 2.3.0
    • Component/s: Deploy, Spark Core
    • Labels: None

    Description

      Currently, in the Standalone Master HA implementation, when an application is submitted in cluster mode, the driver's resources (CPU cores and memory) are not counted again after the Master recovers from a failure. This leads to unexpected behavior, such as more executors being launched than expected and negative core and memory usage in the web UI. In addition, a recovered application's state always stays WAITING; it should be changed to RUNNING once the application is fully recovered.
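
      The accounting problem can be illustrated with a small toy model. This is a sketch only, not Spark's Master code: WorkerInfo and recover_worker are hypothetical names, and the core and memory figures are made up for illustration. It shows how skipping the driver while rebuilding a worker's usage leaves one core too many free, and how usage then goes negative once the driver and executor are released.

```python
from dataclasses import dataclass


@dataclass
class WorkerInfo:
    """Hypothetical stand-in for a master's per-worker resource bookkeeping."""
    cores: int
    memory_mb: int
    cores_used: int = 0
    memory_used_mb: int = 0

    def add(self, cores: int, memory_mb: int) -> None:
        self.cores_used += cores
        self.memory_used_mb += memory_mb

    def remove(self, cores: int, memory_mb: int) -> None:
        self.cores_used -= cores
        self.memory_used_mb -= memory_mb

    @property
    def cores_free(self) -> int:
        return self.cores - self.cores_used


def recover_worker(cores, memory_mb, executors, drivers, count_drivers):
    """Rebuild a worker's usage after failover from its running executors and drivers.

    count_drivers=False models the behavior described in this issue;
    count_drivers=True models the expected behavior.
    """
    worker = WorkerInfo(cores, memory_mb)
    for c, m in executors:
        worker.add(c, m)
    if count_drivers:
        for c, m in drivers:
            worker.add(c, m)
    return worker


# Illustrative numbers: an 8-core worker running one 4-core executor
# and one 1-core driver (cluster mode) at the time of the failover.
executors = [(4, 4096)]
drivers = [(1, 1024)]

buggy = recover_worker(8, 16384, executors, drivers, count_drivers=False)
fixed = recover_worker(8, 16384, executors, drivers, count_drivers=True)

# The buggy model believes an extra core is free, so it can over-schedule.
buggy_free, fixed_free = buggy.cores_free, fixed.cores_free

# When the application finishes, both the executor and the driver are released.
for w in (buggy, fixed):
    for c, m in executors + drivers:
        w.remove(c, m)

print(buggy_free, fixed_free)              # free cores right after recovery
print(buggy.cores_used, fixed.cores_used)  # usage after everything is released
```

      In this model the uncounted driver is the only difference between the two runs, which matches the symptoms reported above: extra capacity appears to be available after recovery, and the counters drop below zero when the driver's resources are later subtracted.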

            People

              Assignee: Saisai Shao (jerryshao)
              Reporter: Saisai Shao (jerryshao)
              Votes: 0
              Watchers: 3
