  1. Hadoop YARN
  2. YARN-3453

Fair Scheduler: Parts of preemption logic use DefaultResourceCalculator even in DRF mode, causing thrashing


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: fairscheduler
    • Labels: None

    Description

      There are two places in the preemption code flow where DefaultResourceCalculator is used, even in DRF mode.
      As a result, more resources are preempted than needed, and the extra preempted containers do not even go to the “starved” queue, because the scheduling logic is based on DRF's calculator.
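
To make the mismatch concrete, here is a minimal, self-contained sketch of how a memory-only comparison and a dominant-share comparison can disagree about the same queue. It does not use Hadoop classes: `Res`, `belowShareMemoryOnly`, `dominantShare`, and `belowShareDrf` are hypothetical stand-ins, and the cluster numbers are invented for illustration.

```java
// Illustrative only: Res and the helpers below are hypothetical stand-ins,
// not Hadoop classes. Numbers are made up for the example.
final class CalculatorMismatchSketch {
    static final class Res {
        final long memoryMb;
        final int vcores;
        Res(long memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /** Memory-only view: usage is "below share" whenever its memory is lower. */
    static boolean belowShareMemoryOnly(Res usage, Res share) {
        return usage.memoryMb < share.memoryMb;
    }

    /** Largest fraction of the cluster that r occupies across resource types. */
    static double dominantShare(Res r, Res cluster) {
        return Math.max((double) r.memoryMb / cluster.memoryMb,
                        (double) r.vcores / cluster.vcores);
    }

    /** DRF view: compare dominant shares instead of a single resource type. */
    static boolean belowShareDrf(Res usage, Res share, Res cluster) {
        return dominantShare(usage, cluster) < dominantShare(share, cluster);
    }

    public static void main(String[] args) {
        Res cluster = new Res(16384, 16);
        Res fairShare = new Res(4096, 4);
        Res usage = new Res(2048, 8); // CPU-heavy queue: little memory, many vcores

        System.out.println("memory-only says below share: "
                + belowShareMemoryOnly(usage, fairShare));
        System.out.println("DRF says below share: "
                + belowShareDrf(usage, fairShare, cluster));
    }
}
```

In this example the memory-only check reports the queue as below its share, while the dominant-share check does not, which is the kind of disagreement that can lead to preempting containers the scheduler then hands elsewhere.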

      The two places are:
      1.

      FSLeafQueue.java
      private boolean isStarved(Resource share)
      

      In DRF mode, a queue shouldn’t be marked as “starved” if its dominant resource usage
      is >= its fair share/min share.

      2.

      FairScheduler.java
      protected Resource resToPreempt(FSLeafQueue sched, long curTime)
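
As a rough sketch of why the sizing step matters, the following compares a simplified memory-only deficit with a dominant-share-aware one. This is not the code from FairScheduler.java or from the attached patches: `Res`, `deficitMemoryOnly`, and `deficitDrf` are hypothetical names, and the memory-only model is a deliberate simplification of how a single-resource comparison behaves.

```java
// Illustrative only: a simplified model of sizing a preemption request.
// None of these names come from Hadoop; they are assumptions for the sketch.
final class PreemptionDeficitSketch {
    static final class Res {
        final long memoryMb;
        final int vcores;
        Res(long memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    static double dominantShare(Res r, Res cluster) {
        return Math.max((double) r.memoryMb / cluster.memoryMb,
                        (double) r.vcores / cluster.vcores);
    }

    /** Memory-only sizing: any memory gap triggers preemption, whatever the vcore usage. */
    static Res deficitMemoryOnly(Res target, Res usage) {
        long memGap = target.memoryMb - usage.memoryMb;
        if (memGap <= 0) {
            return new Res(0, 0);
        }
        return new Res(memGap, Math.max(0, target.vcores - usage.vcores));
    }

    /** DRF-aware sizing: nothing is owed once the dominant usage reaches the target. */
    static Res deficitDrf(Res target, Res usage, Res cluster) {
        if (dominantShare(usage, cluster) >= dominantShare(target, cluster)) {
            return new Res(0, 0);
        }
        return new Res(Math.max(0, target.memoryMb - usage.memoryMb),
                       Math.max(0, target.vcores - usage.vcores));
    }

    public static void main(String[] args) {
        Res cluster = new Res(102400, 100);
        Res target = new Res(20480, 20);  // share the queue is entitled to
        Res usage = new Res(10240, 30);   // low on memory, already high on vcores

        Res a = deficitMemoryOnly(target, usage);
        Res b = deficitDrf(target, usage, cluster);
        System.out.println("memory-only deficit: " + a.memoryMb + " MB, " + a.vcores + " vcores");
        System.out.println("DRF-aware deficit:   " + b.memoryMb + " MB, " + b.vcores + " vcores");
    }
}
```

With these invented numbers, the memory-only model asks for 10240 MB to be preempted even though the queue's dominant resource (vcores) is already above its target, while the dominant-share-aware model asks for nothing.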
      

      --------------------------------------------------------------

      One more thing that I believe needs to change in DRF mode: during a preemption round, if preempting a few containers satisfies the need for one resource type, we should exit that preemption round, since the containers we just preempted should bring the dominant resource usage to the min/fair share.
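
The early-exit idea could look roughly like the sketch below. It is an assumption-laden illustration, not the scheduler's actual loop: `Res` and `selectForPreemption` are hypothetical, candidate ordering is taken as given, and only two resource types are modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical names, not the FairScheduler preemption loop.
final class PreemptionRoundSketch {
    static final class Res {
        final long memoryMb;
        final int vcores;
        Res(long memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /**
     * Selects containers in the given order and ends the round as soon as
     * the need for any one requested resource type has been covered.
     */
    static List<Res> selectForPreemption(List<Res> candidates, Res toPreempt) {
        List<Res> selected = new ArrayList<>();
        boolean needMem = toPreempt.memoryMb > 0;
        boolean needCpu = toPreempt.vcores > 0;
        if (!needMem && !needCpu) {
            return selected; // nothing requested, nothing to preempt
        }
        long memLeft = toPreempt.memoryMb;
        long cpuLeft = toPreempt.vcores;
        for (Res c : candidates) {
            selected.add(c);
            memLeft -= c.memoryMb;
            cpuLeft -= c.vcores;
            // Exit once one requested resource type is satisfied.
            if ((needMem && memLeft <= 0) || (needCpu && cpuLeft <= 0)) {
                break;
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Res> candidates = Arrays.asList(
                new Res(1024, 2), new Res(2048, 1), new Res(2048, 1));
        Res toPreempt = new Res(4096, 2);
        List<Res> picked = selectForPreemption(candidates, toPreempt);
        System.out.println("containers preempted this round: " + picked.size());
    }
}
```

Here the first container already covers the 2 vcores requested, so the round ends after one preemption; a loop that only tracked memory would have gone on to take all three containers.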

      Attachments

        1. YARN-3453.1.patch
          6 kB
          Arun Suresh
        2. YARN-3453.2.patch
          14 kB
          Arun Suresh
        3. YARN-3453.3.patch
          22 kB
          Arun Suresh
        4. YARN-3453.4.patch
          30 kB
          Arun Suresh
        5. YARN-3453.5.patch
          32 kB
          Arun Suresh

            People

              Assignee: Arun Suresh (asuresh)
              Reporter: Ashwin Shankar (ashwinshankar77)
              Votes: 0
              Watchers: 12
