Hadoop Map/Reduce / MAPREDUCE-2108

Allow TaskScheduler to manage the number of slots on TaskTrackers

    Details

      Description

      Currently the map slots and reduce slots are managed by TaskTracker configuration.
      To change the task tracker slots, we need to restart the TaskTrackers.
      Also, for a non-uniform cluster, we have to deploy different sets of configuration.

      Now JobTracker holds the CPU and memory status of TaskTrackers (MAPREDUCE-1218).
      So it makes sense to let JobTracker.taskScheduler decide the number of slots on each node.
      This way we can:
      1. Change the number of slots dynamically without restarting the TaskTrackers
      2. Use a different number of slots based on the resources of each TaskTracker

      To achieve this, we need to change how totalMapSlots and totalReduceSlots are used in JobTracker.
      I think they are used in the WebUI and speculativeCap.

      We will need to make JobTracker calculate these numbers from TaskScheduler and TaskTrackerStatus.
      The TaskScheduler and the TaskTracker can each hold a maximum slot count; we pick the smaller of the two.
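
A minimal sketch of the "pick the smaller one" rule and of how a cluster-wide total could be derived from it. The class and method names (SlotLimitsDemo, effectiveSlots, totalSlots) are hypothetical and are not taken from the patch or from Hadoop; plain int arrays stand in for the per-tracker values that would come from TaskTrackerStatus and TaskScheduler.

```java
// Hypothetical illustration only; none of these names exist in Hadoop.
final class SlotLimitsDemo {

    // Effective slots on one tracker: the smaller of the tracker's own
    // configured maximum and the maximum the scheduler allows for it.
    static int effectiveSlots(int trackerMax, int schedulerMax) {
        return Math.min(trackerMax, schedulerMax);
    }

    // Cluster total (the analogue of totalMapSlots or totalReduceSlots):
    // the sum of the effective slots over all trackers.
    static int totalSlots(int[] trackerMax, int[] schedulerMax) {
        if (trackerMax.length != schedulerMax.length) {
            throw new IllegalArgumentException("one entry per tracker expected");
        }
        int total = 0;
        for (int i = 0; i < trackerMax.length; i++) {
            total += effectiveSlots(trackerMax[i], schedulerMax[i]);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] trackerMax = {4, 8, 2};    // per-tracker configured maximums
        int[] schedulerMax = {6, 5, 2};  // per-tracker scheduler limits
        // min(4,6) + min(8,5) + min(2,2) = 4 + 5 + 2
        System.out.println(totalSlots(trackerMax, schedulerMax)); // prints 11
    }
}
```

Because the total is recomputed from per-tracker values, it would follow any change the scheduler makes to its limits without a TaskTracker restart.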

      Thoughts?

      1. MAPREDUCE-2108.txt
        5 kB
        Scott Chen
      2. MAPREDUCE-2108-v2.txt
        6 kB
        Scott Chen

          Activity

          Scott Chen added a comment -

          Arun: Thanks for the comments. You are right. I guess this is not an issue since we have MRv2. Closing this now.

          Arun C Murthy added a comment -

          Sorry to come in late, the patch has gone stale. Can you please rebase? Thanks.

          Given this is not an issue with MRv2 should we still commit this? I'm happy to, but not sure it's useful. Thanks.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12464816/MAPREDUCE-2108-v2.txt
          against trunk revision 1074251.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/63//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/63//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/63//console

          This message is automatically generated.

          Scott Chen added a comment -

          Updated the patch: replaced the maxSlots in machines.jsp.

          Scott Chen added a comment -

          Can you please describe the changes you are making here...

          Sorry for not making this clear. The purpose here is to move the control of maximum slots to TaskScheduler.
          This allows TaskScheduler to perform better resource scheduling and allows changing the number of slots on the fly.

          The changes made in the patch are the following:
          1. Add a getMaxSlots(TaskTrackerStatus, TaskType) method to TaskScheduler.
          2. Replace TaskTrackerStatus.getMaxSlots() everywhere in the JobTracker with the above method.

          This way the JobTracker pulls the "maximum slots" information from TaskScheduler.
          The default implementation of TaskScheduler.getMaxSlots() simply returns TaskTrackerStatus.getMaxSlots(), so it does not change any behavior.
          But people can override this method and put more sophisticated logic in it (see MAPREDUCE-2198).
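
A rough sketch of the shape described above, not the patch itself. All types here (SketchTaskType, SketchTrackerStatus, SketchTaskScheduler, MemoryAwareScheduler) are hypothetical stand-ins for the Hadoop classes. The per-type getMaxSlots(type) signature on the status object and the memory-based policy in the subclass are assumptions made up for illustration; they are not what MAPREDUCE-2198 implements.

```java
// Hypothetical stand-ins; these are not the real Hadoop classes.
enum SketchTaskType { MAP, REDUCE }

final class SketchTrackerStatus {
    private final int maxMapSlots;
    private final int maxReduceSlots;
    private final int availableMemoryMb; // assumed resource figure

    SketchTrackerStatus(int maxMapSlots, int maxReduceSlots, int availableMemoryMb) {
        this.maxMapSlots = maxMapSlots;
        this.maxReduceSlots = maxReduceSlots;
        this.availableMemoryMb = availableMemoryMb;
    }

    // The tracker's own configured maximum for the given task type.
    int getMaxSlots(SketchTaskType type) {
        return type == SketchTaskType.MAP ? maxMapSlots : maxReduceSlots;
    }

    int getAvailableMemoryMb() {
        return availableMemoryMb;
    }
}

class SketchTaskScheduler {
    // Default: report whatever the tracker itself is configured with,
    // so a scheduler that does not override this sees no behavior change.
    int getMaxSlots(SketchTrackerStatus status, SketchTaskType type) {
        return status.getMaxSlots(type);
    }
}

// Example override with an invented policy: cap slots by available memory.
class MemoryAwareScheduler extends SketchTaskScheduler {
    private final int memoryPerSlotMb;

    MemoryAwareScheduler(int memoryPerSlotMb) {
        if (memoryPerSlotMb <= 0) {
            throw new IllegalArgumentException("memoryPerSlotMb must be positive");
        }
        this.memoryPerSlotMb = memoryPerSlotMb;
    }

    @Override
    int getMaxSlots(SketchTrackerStatus status, SketchTaskType type) {
        int byMemory = status.getAvailableMemoryMb() / memoryPerSlotMb;
        // Never exceed the tracker's own configured maximum.
        return Math.min(status.getMaxSlots(type), byMemory);
    }
}
```

With a tracker configured for 4 map slots and 2 reduce slots and reporting 3072 MB available, the default scheduler returns 4 map slots, while `new MemoryAwareScheduler(1024)` returns 3 map slots and 2 reduce slots.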

          Arun C Murthy added a comment -

          Can you please describe the changes you are making here...

          Scott Chen added a comment -

          Hey Arun,

          Thanks. I have made the first patch.
          Like you mentioned, the slot may be used in many places and we need to check carefully.
          I will read the code more carefully. I have a feeling that there are still things that need to be changed.

          At least this patch will not change the behavior if we don't do anything to the scheduler.

          Scott

          Arun C Murthy added a comment -

          I like the direction.

          I'll warn you that the current implementation of the JT has the notion of 'slot' burned in very deeply into various parts of the codebase, not to mention the TT itself. You might need to do a complete rewrite/refactor of the JT/TT to pull this off. My 2c. Good luck.


            People

            • Assignee: Scott Chen
            • Reporter: Scott Chen
            • Votes: 0
            • Watchers: 14
