Spark / SPARK-12552

Recovered driver's resource is not counted in the Master

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 2.2.0, 2.3.0
    • Component/s: Deploy, Spark Core
    • Labels:
      None

      Description

      Currently, in the Standalone Master HA implementation, if an application is submitted in cluster mode, the driver's resources (CPU cores and memory) are not counted again when the Master recovers from a failure. This leads to unexpected behavior, such as more executors being launched than expected and negative core and memory usage in the web UI. In addition, the recovered application's state always remains WAITING; it should be changed to RUNNING once the application is fully recovered.
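
      The accounting problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's actual Master code: the `Driver` and `Worker` classes, the `recover` function, and the `recount_drivers` flag are all hypothetical names invented for this illustration. It shows how skipping the driver during recovery makes every core look free, and how usage goes negative once that driver later finishes.

```python
from dataclasses import dataclass, field


@dataclass
class Driver:
    driver_id: str
    cores: int
    memory_mb: int


@dataclass
class Worker:
    """Toy stand-in for per-worker bookkeeping (hypothetical, not Spark's class)."""
    cores: int
    memory_mb: int
    cores_used: int = 0
    memory_used_mb: int = 0
    drivers: dict = field(default_factory=dict)

    def add_driver(self, driver: Driver) -> None:
        # Charge the driver's cores and memory against this worker.
        self.drivers[driver.driver_id] = driver
        self.cores_used += driver.cores
        self.memory_used_mb += driver.memory_mb

    def remove_driver(self, driver: Driver) -> None:
        # Release the driver's resources unconditionally, even if they
        # were never charged; this is what allows negative usage.
        self.drivers.pop(driver.driver_id, None)
        self.cores_used -= driver.cores
        self.memory_used_mb -= driver.memory_mb

    @property
    def cores_free(self) -> int:
        return self.cores - self.cores_used


def recover(worker_cores, worker_memory_mb, running_drivers, recount_drivers):
    """Rebuild a worker's bookkeeping after a simulated master failover.

    recount_drivers=False models the reported bug (drivers are not
    re-charged); recount_drivers=True models the intended behavior.
    """
    worker = Worker(worker_cores, worker_memory_mb)
    if recount_drivers:
        for d in running_drivers:
            worker.add_driver(d)
    return worker


driver = Driver("driver-0", cores=2, memory_mb=1024)

# Buggy recovery: the running driver is forgotten.
buggy = recover(8, 16384, [driver], recount_drivers=False)
print(buggy.cores_free)      # 8: all cores look free, so too many executors fit
buggy.remove_driver(driver)  # the driver finishes later
print(buggy.cores_used)      # -2: negative usage, as seen in the web UI

# Recovery that re-counts the driver.
fixed = recover(8, 16384, [driver], recount_drivers=True)
print(fixed.cores_free)      # 6
fixed.remove_driver(driver)
print(fixed.cores_used)      # 0
```

      In this model the fix amounts to re-charging each still-running driver to its worker during recovery, so that later releases balance out. The WAITING to RUNNING state transition mentioned in the description is a separate concern and is not modeled here.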

              People

              • Assignee: Saisai Shao (jerryshao)
              • Reporter: Saisai Shao (jerryshao)