Hadoop YARN / YARN-3453

Fair Scheduler: Parts of preemption logic use DefaultResourceCalculator even in DRF mode, causing thrashing

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: fairscheduler
    • Labels: None

    Description

      There are two places in the preemption code flow where DefaultResourceCalculator is used, even in DRF mode.
      As a result, more resources are preempted than needed, and the extra preempted containers do not even reach the “starved” queue, since the scheduling logic is based on DRF's calculator.

      The two places are:

      1. FSLeafQueue.java
      private boolean isStarved(Resource share)
      

      A queue shouldn’t be marked as “starved” if its dominant resource usage
      is >= its fair share or min share.
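To illustrate why the choice of calculator matters here, below is a minimal sketch of the two starvation checks. It does not use the Hadoop API: `StarvationSketch`, `Res`, `dominantShare`, `isStarvedMemoryOnly` and `isStarvedDominant` are hypothetical names introduced only for this example, and the memory-only check is an assumed simplification of what a memory-based calculator would conclude.

```java
// Hypothetical, simplified stand-ins; this is NOT the Hadoop Resource or
// ResourceCalculator API, only an illustration of the comparison described above.
final class StarvationSketch {

    /** A two-dimensional resource vector: memory in MB and virtual cores. */
    static final class Res {
        final long memoryMb;
        final long vcores;

        Res(long memoryMb, long vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /** Largest fraction of the cluster that r occupies across both resource types. */
    static double dominantShare(Res r, Res cluster) {
        double memShare = (double) r.memoryMb / cluster.memoryMb;
        double cpuShare = (double) r.vcores / cluster.vcores;
        return Math.max(memShare, cpuShare);
    }

    /** Memory-only check (assumed behaviour of a memory-based comparison). */
    static boolean isStarvedMemoryOnly(Res usage, Res share) {
        return usage.memoryMb < share.memoryMb;
    }

    /** DRF-style check: starved only if dominant usage is below the share's dominant fraction. */
    static boolean isStarvedDominant(Res usage, Res share, Res cluster) {
        return dominantShare(usage, cluster) < dominantShare(share, cluster);
    }
}
```

With a cluster of 102400 MB and 100 vcores, a share of 20480 MB and 20 vcores, and a usage of 10240 MB and 20 vcores, the memory-only check reports the queue as starved, while the dominant-share check does not, because the queue already holds its share of vcores.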

      2. FairScheduler.java
      protected Resource resToPreempt(FSLeafQueue sched, long curTime)
      

      --------------------------------------------------------------

      One more thing that I believe needs to change in DRF mode: during a preemption round, if preempting a few containers satisfies the need for one resource type, we should exit that preemption round, since the containers we just preempted should bring the dominant resource usage up to the min/fair share.
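The early-exit idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not the FairScheduler code: `PreemptionRoundSketch`, `Res` and `selectForPreemption` are hypothetical names, candidates are assumed to arrive already ordered by preemption priority, and a resource type counts as "needed" only if its initial requirement is positive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed early exit; not the Hadoop FairScheduler API.
final class PreemptionRoundSketch {

    /** A two-dimensional resource vector: memory in MB and virtual cores. */
    static final class Res {
        final long memoryMb;
        final long vcores;

        Res(long memoryMb, long vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /**
     * Picks containers to preempt, in the given order, and stops as soon as the
     * need for any one initially-needed resource type has been covered.
     */
    static List<Res> selectForPreemption(Res toPreempt, List<Res> candidates) {
        long memLeft = toPreempt.memoryMb;
        long cpuLeft = toPreempt.vcores;
        // Only resource types with a positive initial need can trigger the exit.
        boolean needMem = memLeft > 0;
        boolean needCpu = cpuLeft > 0;
        List<Res> selected = new ArrayList<>();
        for (Res container : candidates) {
            boolean nothingNeeded = !needMem && !needCpu;
            boolean memSatisfied = needMem && memLeft <= 0;
            boolean cpuSatisfied = needCpu && cpuLeft <= 0;
            if (nothingNeeded || memSatisfied || cpuSatisfied) {
                break; // exit the round: one resource type's need is already met
            }
            selected.add(container);
            memLeft -= container.memoryMb;
            cpuLeft -= container.vcores;
        }
        return selected;
    }
}
```

For a need of 4096 MB and 2 vcores against four candidate containers of 1024 MB and 1 vcore each, this sketch selects two containers and stops, because the vcore need is covered at that point; a loop driven by memory alone would have taken all four.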

      Attachments

        1. YARN-3453.1.patch
          6 kB
          Arun Suresh
        2. YARN-3453.2.patch
          14 kB
          Arun Suresh
        3. YARN-3453.3.patch
          22 kB
          Arun Suresh
        4. YARN-3453.4.patch
          30 kB
          Arun Suresh
        5. YARN-3453.5.patch
          32 kB
          Arun Suresh

          People

            Assignee: Arun Suresh (asuresh)
            Reporter: Ashwin Shankar (ashwinshankar77)
            Votes: 0
            Watchers: 12

            Dates

              Created:
              Updated:
              Resolved: