Spark / SPARK-41848

Tasks are over-scheduled with TaskResourceProfile


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 3.4.0
    • Fix Version/s: 3.4.0
    • Component/s: Spark Core
    • Labels: None

    Description

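      // Assumes a Spark test suite that provides a mutable `sc` (e.g. via LocalSparkContext).
      import org.apache.spark.{SparkConf, SparkContext}
      import org.apache.spark.resource.{ResourceProfileBuilder, TaskResourceRequests}

      test("SPARK-XXX") {
        // local-cluster[1,4,1024]: 1 worker with 4 cores and 1024 MB of memory
        val conf = new SparkConf().setAppName("test").setMaster("local-cluster[1,4,1024]")
        sc = new SparkContext(conf)
        // Each task requests 3 CPUs; with task-only requests, build() yields a TaskResourceProfile
        val req = new TaskResourceRequests().cpus(3)
        val rp = new ResourceProfileBuilder().require(req).build()

        // Two partitions give two tasks; the sleep makes overlapping execution visible in the logs
        val res = sc.parallelize(Seq(0, 1), 2).withResources(rp).map { x =>
          Thread.sleep(5000)
          x * 2
        }.collect()
        assert(res === Array(0, 2))
      }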

      In this test, the two tasks should run one after the other, since each task requires 3 cores and the executor has only 4. However, the logs show that both tasks are launched concurrently.
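
The expected behavior follows from simple arithmetic, sketched below. `ExpectedConcurrency` and `maxConcurrentTasks` are hypothetical names used only for illustration; they are not part of Spark.

```scala
// Hypothetical helper, not part of Spark: how many tasks fit on one executor at once.
object ExpectedConcurrency {
  def maxConcurrentTasks(executorCores: Int, taskCpus: Int): Int = {
    require(taskCpus > 0, "taskCpus must be positive")
    executorCores / taskCpus // integer division rounds down
  }

  def main(args: Array[String]): Unit = {
    // Values from the test above: local-cluster[1,4,1024] and cpus(3).
    println(maxConcurrentTasks(executorCores = 4, taskCpus = 3)) // prints 1
  }
}
```

With 4 executor cores and 3 CPUs per task, at most one task should be running at any time.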

      It turns out that we used the TaskResourceProfile (taskCpus=3) of the taskset for task scheduling:

      val rpId = taskSet.taskSet.resourceProfileId
      val taskSetProf = sc.resourceProfileManager.resourceProfileFromId(rpId)
      val taskCpus = ResourceProfile.getTaskCpusOrDefaultForProfile(taskSetProf, conf) 

      but the ResourceProfile (taskCpus=1) of the executor for updating the free cores in ExecutorData:

      val rpId = executorData.resourceProfileId
      val prof = scheduler.sc.resourceProfileManager.resourceProfileFromId(rpId)
      val taskCpus = ResourceProfile.getTaskCpusOrDefaultForProfile(prof, conf)
      executorData.freeCores -= taskCpus 

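      which makes the tracked free cores inconsistent: a task is admitted only if 3 cores are free, but each launch deducts just 1 core. After the first task launches, the executor still reports 3 free cores (4 - 1), so the second task is launched as well.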
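
A toy model of this bookkeeping is sketched below. It is not Spark's scheduler code: `OverScheduleSketch`, `launch`, `checkCpus`, and `deductCpus` are hypothetical names, and the real scheduler spreads these steps over several resource-offer rounds. The sketch only shows why admitting with one `taskCpus` value and deducting with another over-schedules.

```scala
// Toy model of the free-core bookkeeping, NOT Spark's scheduler code.
// All names here are hypothetical and exist only for illustration.
object OverScheduleSketch {
  /** Returns how many of `pending` tasks end up running concurrently on one executor.
    *
    * checkCpus:  cores that must be free to admit a task (task set's profile)
    * deductCpus: cores subtracted from the free count per launched task
    */
  def launch(executorCores: Int, pending: Int, checkCpus: Int, deductCpus: Int): Int = {
    var freeCores = executorCores
    var launched = 0
    while (launched < pending && freeCores >= checkCpus) {
      freeCores -= deductCpus
      launched += 1
    }
    launched
  }

  def main(args: Array[String]): Unit = {
    // Mismatch as described: admit with taskCpus=3, deduct with taskCpus=1.
    println(launch(executorCores = 4, pending = 2, checkCpus = 3, deductCpus = 1)) // prints 2
    // Consistent accounting: admit and deduct with the same taskCpus=3.
    println(launch(executorCores = 4, pending = 2, checkCpus = 3, deductCpus = 3)) // prints 1
  }
}
```

Under these assumptions, using the same `taskCpus` value for both the admission check and the deduction keeps the second task waiting until the first one releases its cores.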

          People

            Assignee: ivoson Tengfei Huang
            Reporter: Ngone51 wuyi