Hadoop Common
  1. Hadoop Common
  2. HADOOP-5884

Capacity scheduler should account high memory jobs as using more capacity of the queue

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Hide
      Fixes Capacity scheduler to account more capacity of a queue for a high memory job. Done by considering these jobs to
      take more slots proportionally with respect to a slot's default memory size.
      Show
      Fixes Capacity scheduler to account more capacity of a queue for a high memory job. Done by considering these jobs to take more slots proportionally with respect to a slot's default memory size.

      Description

      Currently, when a high memory job is scheduled by the capacity scheduler, each task scheduled counts only once in the capacity of the queue, though it may actually be preventing other jobs from using spare slots on that node because of its higher memory requirements. In order to be fair, the capacity scheduler should proportionally (with respect to default memory) account high memory jobs as using a larger capacity of the queue.

      1. HADOOP-5884-20090529.1.txt
        45 kB
        Vinod Kumar Vavilapalli
      2. HADOOP-5884-20090602.1.txt
        64 kB
        Vinod Kumar Vavilapalli
      3. HADOOP-5884-20090603.txt
        69 kB
        Vinod Kumar Vavilapalli
      4. HADOOP-5884.patch
        66 kB
        Hemanth Yamijala
      5. HADOOP-5884-20090605-branch-20.txt
        69 kB
        Vinod Kumar Vavilapalli

        Issue Links

          Activity

          Hide
          Vinod Kumar Vavilapalli added a comment -

          The proposal is to track capacities and user-limits by the number of slots occupied by the tasks of a job instead of the number of running tasks.

          Attaching patch implementing this. This patch has to be applied over the latest patch for HADOOP-5932. This patch does the following:

          • Modifies all the calculations of capacities and user-limits to be based on the number of slots occupied by running tasks of a job.
          • Retains number of running tasks for displaying on the UI
          • Adds test-cases to verify the number of slots accounted for high memory jobs by modifying the corresponding tests.
          • Adds test-cases to verify the newly added "occupied slots" in the scheduling information
          • Adds missing @override tags, removes stale imports and stale occurrences of gc (guarenteed capacity)
          Show
          Vinod Kumar Vavilapalli added a comment - The proposal is to track capacities and user-limits by the number of slots occupied by the tasks of a job instead of the number of running tasks. Attaching patch implementing this. This patch has to be applied over the latest patch for HADOOP-5932 . This patch does the following: Modifies all the calculations of capacities and user-limits to be based on the number of slots occupied by running tasks of a job. Retains number of running tasks for displaying on the UI Adds test-cases to verify the number of slots accounted for high memory jobs by modifying the corresponding tests. Adds test-cases to verify the newly added "occupied slots" in the scheduling information Adds missing @override tags, removes stale imports and stale occurrences of gc (guarenteed capacity)
          Hide
          Vinod Kumar Vavilapalli added a comment -

          The above patch also incorporates minor modifications to test-cases as suggested in HADOOP-5934.

          Show
          Vinod Kumar Vavilapalli added a comment - The above patch also incorporates minor modifications to test-cases as suggested in HADOOP-5934 .
          Hide
          Hemanth Yamijala added a comment -

          Some comments:

          • TaskSchedulingInfo.toString() - displaying the actual value had some problem in terms of exactness and mismatch between cluster info and the state we kept. That's why we shifted to percentages. May be a good idea to retain the model. Same argument can be made for running tasks and numSlotsOccupiedByThisUser
          • "Occupied slots" seems too techie. Call it 'Used capacity' ? Likewise instead of '% of total slots occupied by all users', call it '% of used capacity' ?
          • TaskSchedulingMgr.isUserOverLimit() - we add 1 if we're using more than the queue capacity. It could be more than 1, depending on the task we are assigning (if it's part of high RAM job)
          • MapSchedulingMgr constructor: typo: schedulr - should be scheduler. Similar for Reduce...
          • Minor NIT: Use format instead of the complicated StringBuffer.append()... kind of code. Makes it really hard to find what's happening.
          • updateQSIObjects. The log statement is printing numMapSlotsForThisJob instead of numMapsRunningForThisJob.
          Show
          Hemanth Yamijala added a comment - Some comments: TaskSchedulingInfo.toString() - displaying the actual value had some problem in terms of exactness and mismatch between cluster info and the state we kept. That's why we shifted to percentages. May be a good idea to retain the model. Same argument can be made for running tasks and numSlotsOccupiedByThisUser "Occupied slots" seems too techie. Call it 'Used capacity' ? Likewise instead of '% of total slots occupied by all users', call it '% of used capacity' ? TaskSchedulingMgr.isUserOverLimit() - we add 1 if we're using more than the queue capacity. It could be more than 1, depending on the task we are assigning (if it's part of high RAM job) MapSchedulingMgr constructor: typo: schedulr - should be scheduler. Similar for Reduce... Minor NIT: Use format instead of the complicated StringBuffer.append()... kind of code. Makes it really hard to find what's happening. updateQSIObjects. The log statement is printing numMapSlotsForThisJob instead of numMapsRunningForThisJob.
          Hide
          Arun C Murthy added a comment -

          Can we also add the number of slots to the UI?

          Show
          Arun C Murthy added a comment - Can we also add the number of slots to the UI?
          Hide
          Arun C Murthy added a comment -

          Long term - we really should fix TestCapacityScheduler to not check strings and use relevant apis (even package-private ones).

          Show
          Arun C Murthy added a comment - Long term - we really should fix TestCapacityScheduler to not check strings and use relevant apis (even package-private ones).
          Hide
          Hemanth Yamijala added a comment -

          Some comments on test cases:

          • testClusterBlockingForLackOfMemory needs updates to validate the number of slots
          • I think we may need two additional tests:
            • We should have a test to check the change in sorting of queues done based on slots than running tasks. We could do this by submitting 2 jobs to 2 queues, 1 is normal and the other is high RAM. We can check that for every assignTasks call, if 1 task of the high RAM job is scheduled, two tasks of the normal job are (assuming 2 slots for a task of the high RAM Jobs).
            • We should have a check on user limits that a high RAM job hits it's user limits twice as fast as a normal job, again assuming 2 slots for a task of the high RAM job.
          Show
          Hemanth Yamijala added a comment - Some comments on test cases: testClusterBlockingForLackOfMemory needs updates to validate the number of slots I think we may need two additional tests: We should have a test to check the change in sorting of queues done based on slots than running tasks. We could do this by submitting 2 jobs to 2 queues, 1 is normal and the other is high RAM. We can check that for every assignTasks call, if 1 task of the high RAM job is scheduled, two tasks of the normal job are (assuming 2 slots for a task of the high RAM Jobs). We should have a check on user limits that a high RAM job hits it's user limits twice as fast as a normal job, again assuming 2 slots for a task of the high RAM job.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Updated patch incorporating all the above review comments except one:

          • Removed running tasks information from the UI. As of now, we are trying to avoid absolute numbers because of possible inconsistency between scheduler's information and cluster status. And, specifying running tasks as a percentage of total cluster capacity doesn't make sense now with each task possibly occupying multiple slots. The correct fix is to print absolute numbers after removing any inconsisteny possible. Hence pushing this to another follow-up jira issue.

          @Arun

          Can we also add the number of slots to the UI?

          I didn't get this. Do you mean number of slots per job being displayed in job-scheduling information? We are already displaying the number of slots used by a queue as percentage.

          If you meant the first, I already considered this, but let it go for another jira. The job scheduling information is being displayed on the jobtracker ui first page and it looked ugly when it spanned multiple lines. I think it would be good if we can remove job scheduling information from the first page. But as that might trigger discussion, I've decided to leave it for now.

          Long term - we really should fix TestCapacityScheduler to not check strings and use relevant apis (even package-private ones).

          Agree, even I could realize the pain while modifying testcases, but decide to postpone it for another jira as it is slightly tricky.

          Show
          Vinod Kumar Vavilapalli added a comment - Updated patch incorporating all the above review comments except one: Removed running tasks information from the UI. As of now, we are trying to avoid absolute numbers because of possible inconsistency between scheduler's information and cluster status. And, specifying running tasks as a percentage of total cluster capacity doesn't make sense now with each task possibly occupying multiple slots. The correct fix is to print absolute numbers after removing any inconsisteny possible. Hence pushing this to another follow-up jira issue. @Arun Can we also add the number of slots to the UI? I didn't get this. Do you mean number of slots per job being displayed in job-scheduling information? We are already displaying the number of slots used by a queue as percentage. If you meant the first, I already considered this, but let it go for another jira. The job scheduling information is being displayed on the jobtracker ui first page and it looked ugly when it spanned multiple lines. I think it would be good if we can remove job scheduling information from the first page. But as that might trigger discussion, I've decided to leave it for now. Long term - we really should fix TestCapacityScheduler to not check strings and use relevant apis (even package-private ones). Agree, even I could realize the pain while modifying testcases, but decide to postpone it for another jira as it is slightly tricky.
          Hide
          Arun C Murthy added a comment -

          I'm just proposing we add #slots (along with already available #running_tasks) to both per-queue info and per-job info (jobdetails.jsp) so that it's clear to users that the queue isn't being under-served (since #running_tasks might be lesser than #slots_taken).

          Show
          Arun C Murthy added a comment - I'm just proposing we add #slots (along with already available #running_tasks) to both per-queue info and per-job info (jobdetails.jsp) so that it's clear to users that the queue isn't being under-served (since #running_tasks might be lesser than #slots_taken).
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Latest patch with Arun's comments incorporated, bringing back the absolute counts for running tasks, occupied capacities. Also added job scheduling information to the jobdetails.jsp page.

          Show
          Vinod Kumar Vavilapalli added a comment - Latest patch with Arun's comments incorporated, bringing back the absolute counts for running tasks, occupied capacities. Also added job scheduling information to the jobdetails.jsp page.
          Hide
          Hemanth Yamijala added a comment -

          A slightly modified patch. Basically just makes the comments and debug statements in test cases match code. The list of changes made are the following:

          • Added a comment on the getOrderedQueues method.
          • In testUserLimitsForHighMemoryJobs - set max reduce slots set to 2G instead of 1, as we are submitting jobs with 2G reduces.
          • Same test, JobConf was being overwritten. I changed that.
          • Also, Debug statement not matching the submitted high RAM job. (were saying 0MB reduces) Changed that.
          • Also corrected debug statements in testQueueOrdering
          Show
          Hemanth Yamijala added a comment - A slightly modified patch. Basically just makes the comments and debug statements in test cases match code. The list of changes made are the following: Added a comment on the getOrderedQueues method. In testUserLimitsForHighMemoryJobs - set max reduce slots set to 2G instead of 1, as we are submitting jobs with 2G reduces. Same test, JobConf was being overwritten. I changed that. Also, Debug statement not matching the submitted high RAM job. (were saying 0MB reduces) Changed that. Also corrected debug statements in testQueueOrdering
          Hide
          Hemanth Yamijala added a comment -

          Results of test-patch:

          [exec] +1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 4 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
          [exec]

          All capacity scheduler tests pass, except TestQueueCapacities which is being tracked elsewhere.

          Show
          Hemanth Yamijala added a comment - Results of test-patch: [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 4 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] All capacity scheduler tests pass, except TestQueueCapacities which is being tracked elsewhere.
          Hide
          Hemanth Yamijala added a comment -

          Vinod, can you please upload a patch for Hadoop 0.20, so I can commit it to the 0.20 branch as well ? I will commit both together.

          Show
          Hemanth Yamijala added a comment - Vinod, can you please upload a patch for Hadoop 0.20, so I can commit it to the 0.20 branch as well ? I will commit both together.
          Hide
          Vinod Kumar Vavilapalli added a comment -

          Patch for branch-0.20

          Show
          Vinod Kumar Vavilapalli added a comment - Patch for branch-0.20
          Hide
          Hemanth Yamijala added a comment -

          I just committed this to trunk and branch 0.20. Thanks, Vinod !

          Show
          Hemanth Yamijala added a comment - I just committed this to trunk and branch 0.20. Thanks, Vinod !
          Hide
          Hudson added a comment -
          Show
          Hudson added a comment - Integrated in Hadoop-trunk #863 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/863/ )

            People

            • Assignee:
              Vinod Kumar Vavilapalli
              Reporter:
              Hemanth Yamijala
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development