Spark / SPARK-39742

Requesting executors after killing executors does not yield the expected number of executors


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Bug
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: Scheduler
    • Labels: None

    Description

      I used the killExecutors and requestExecutors functions of SparkContext to adjust resources dynamically, and found that calling requestExecutors after killExecutors does not produce the expected number of executors.

      To reproduce, add the following unit test to StandaloneDynamicAllocationSuite.scala:

      test("kill executors first and then request") {
          sc = new SparkContext(appConf
            .set(config.EXECUTOR_CORES, 2)
            .set(config.CORES_MAX, 8))
          val appId = sc.applicationId
          eventually(timeout(10.seconds), interval(10.millis)) {
            val apps = getApplications()
            assert(apps.size === 1)
            assert(apps.head.id === appId)
            assert(apps.head.executors.size === 4) // 8 cores total
            assert(apps.head.getExecutorLimit === Int.MaxValue)
          }
          // sync executors between the Master and the driver, needed because
          // the driver refuses to kill executors it does not know about
          syncExecutors(sc)
          val executors = getExecutorIds(sc)
          assert(executors.size === 4)
          // kill 3 of the 4 executors
          assert(sc.killExecutors(executors.take(3)))
          val apps = getApplications()
          assert(apps.head.executors.size === 1)
          // request 3 additional executors
          assert(sc.requestExecutors(3))
          assert(apps.head.executors.size === 4)
        } 

      The final assertion fails:

      3 did not equal 4
      Expected :4
      Actual   :3
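
      One possible reading of the 3, consistent with the "Not A Bug" resolution, is that requestExecutors adds to a target total tracked by the driver rather than to the number of executors currently running, and that killExecutors lowers that target without letting it go below zero. This is an assumption about the scheduler backend's bookkeeping, not something stated in this report. The sketch below is a toy model of that assumption only: `ToyCluster`, `kill`, and `request` are hypothetical names and are not Spark APIs.

```scala
// Toy model only: this is NOT Spark code. It illustrates one assumed
// bookkeeping scheme in which the driver tracks an explicit target total.
final case class ToyCluster(maxExecutors: Int, running: Int, target: Option[Int]) {

  // Assumption: killing n executors lowers the target by n, floored at 0.
  // With no explicit target yet, the target is treated as 0.
  def kill(n: Int): ToyCluster = {
    val newTarget = math.max(target.getOrElse(0) - n, 0)
    copy(running = running - n, target = Some(newTarget))
  }

  // Assumption: requesting n executors raises the target by n. Executors are
  // then launched until the running count reaches the target (capped by the
  // cluster capacity); executors already running are never removed here.
  def request(n: Int): ToyCluster = {
    val newTarget = target.getOrElse(0) + n
    val newRunning = math.max(running, math.min(newTarget, maxExecutors))
    copy(running = newRunning, target = Some(newTarget))
  }
}

object ToyClusterDemo {
  def main(args: Array[String]): Unit = {
    // 8 cores at 2 cores per executor: 4 executors, no explicit target yet.
    val start = ToyCluster(maxExecutors = 4, running = 4, target = None)
    val afterKill = start.kill(3)
    val afterRequest = afterKill.request(3)
    // prints "after kill: 1 running, after request: 3 running"
    println(s"after kill: ${afterKill.running} running, " +
      s"after request: ${afterRequest.running} running")
  }
}
```

      Under this model the target is 0 after the kill and 3 after the request, so the cluster settles at 3 executors instead of 1 + 3 = 4, which matches the failure above. If the real behavior follows this scheme, setting an absolute total (for example with SparkContext.requestTotalExecutors) may be the more predictable way to reach 4.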

      People

        Assignee: Unassigned
        Reporter: Mingliang Zhu (zhuml)