Hadoop Map/Reduce
MAPREDUCE-3789

CapacityTaskScheduler may perform unnecessary reservations in heterogeneous tracker environments

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.1.0
    • Fix Version/s: 1.1.0
    • Component/s: capacity-sched, scheduler
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Briefly, to reproduce:

      • Run the JT with CapacityTaskScheduler (say, cluster-wide maximum map task memory = 8G, memory per map slot = 2G).
      • Run two TTs with different capacities, say one with 4 map slots and another with 3 map slots.
      • Run a job with two map tasks, each demanding memory worth at least 4 slots (map task memory = 7G or so).
      • The job begins running on TT #1, but also ends up reserving the 3 slots on TT #2, because the scheduler does not check the tracker's maximum slot limit when reserving (it reserves greedily, hoping to gain more slots in the future).
      • Other jobs that could have run within TT #2's 3 slots are thereby blocked out by this needless reservation.

      I have not yet tested MR2 for this, so feel free to weigh in if it affects MR2 as well.

      For MR1, I have initially attached a test case that demonstrates this. A fix that checks reservations against a tracker's maximum slots is to follow; the sketch below outlines the idea.
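
      A minimal, illustrative sketch of such a guard, not the attached patch: the getTTMaxSlotsForType helper and the TaskTrackerStatus/TaskType accessors appear in the patch discussion further below, while the class name, the mayReserve method, and the import paths are assumed here.

        import org.apache.hadoop.mapred.TaskTrackerStatus;
        import org.apache.hadoop.mapreduce.TaskType;

        class ReservationGuardSketch {

          // Maximum number of slots this tracker can ever offer for the given task type.
          private static int getTTMaxSlotsForType(TaskTrackerStatus status, TaskType type) {
            return (type == TaskType.MAP) ? status.getMaxMapSlots() : status.getMaxReduceSlots();
          }

          // Reserve a tracker for a high-memory task only if the tracker's configured maximum
          // could ever satisfy the task's slot demand; a 3-slot tracker should never be
          // reserved for a task that needs 4 slots, since the task can never run there.
          static boolean mayReserve(TaskTrackerStatus status, TaskType type, int slotsPerTask) {
            return slotsPerTask <= getTTMaxSlotsForType(status, type);
          }
        }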

      Attachments

      1. MAPREDUCE-3789.patch
        4 kB
        Harsh J
      2. MAPREDUCE-3789.patch
        6 kB
        Harsh J
      3. MAPREDUCE-3789.patch
        6 kB
        Harsh J

          Activity

          Harsh J added a comment -

          Patch with testcase alone to illustrate this bug.

          Harsh J added a comment -

          Test log for this scenario:

          12/02/02 15:30:10 INFO mapred.CapacityTaskScheduler: Initializing 'default' queue with cap=100.0, maxCap=-1.0, ulMin=100, ulMinFactor=1.0, supportsPriorities=true, maxJobsToInit=3000, maxJobsToAccept=30000, maxActiveTasks=200000, maxJobsPerUserToInit=3000, maxJobsPerUserToAccept=30000, maxActiveTasksPerUser=100000
          12/02/02 15:30:10 INFO mapred.CapacityTaskScheduler: Scheduler configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (2048,2048,8192,8192)
          12/02/02 15:30:10 INFO mapred.CapacityTaskScheduler: Added new queue: default
          12/02/02 15:30:10 INFO mapred.CapacityTaskScheduler: Capacity scheduler initialized 1 queues
          12/02/02 15:30:10 INFO delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
          12/02/02 15:30:10 INFO mapred.JobTracker: Scheduler configured with (memSizeForMapSlotOnJT, memSizeForReduceSlotOnJT, limitMaxMemForMapTasks, limitMaxMemForReduceTasks) (-1, -1, -1, -1)
          12/02/02 15:30:10 INFO util.HostsFileReader: Refreshing hosts (include/exclude) list
          12/02/02 15:30:10 INFO delegation.AbstractDelegationTokenSecretManager: Starting expired delegation token remover thread, tokenRemoverScanInterval=60 min(s)
          12/02/02 15:30:10 INFO delegation.AbstractDelegationTokenSecretManager: Updating the current master key for generating delegation tokens
          12/02/02 15:30:10 INFO mapred.JobTracker: Starting jobtracker with owner as harshchouraria
          12/02/02 15:30:10 INFO ipc.Server: Starting SocketReader
          12/02/02 15:30:10 INFO http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
          12/02/02 15:30:10 INFO http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 0
          12/02/02 15:30:10 INFO http.HttpServer: listener.getLocalPort() returned 55111 webServer.getConnectors()[0].getLocalPort() returned 55111
          12/02/02 15:30:10 INFO http.HttpServer: Jetty bound to port 55111
          2012-02-02 15:30:10.216:INFO::jetty-6.1.26
          2012-02-02 15:30:10.361:INFO::Started SelectChannelConnector@0.0.0.0:55111
          12/02/02 15:30:10 INFO mapred.JobTracker: JobTracker up at: 55110
          12/02/02 15:30:10 INFO mapred.JobTracker: JobTracker webserver: 55111
          12/02/02 15:30:10 INFO mapred.JobTracker: Cleaning up the system directory
          12/02/02 15:30:10 INFO mapred.JobTracker: History server being initialized in embedded mode
          12/02/02 15:30:10 INFO mapred.JobHistoryServer: Started job history server at: localhost:55111
          12/02/02 15:30:10 INFO mapred.JobTracker: Job History Server web address: localhost:55111
          12/02/02 15:30:10 INFO mapred.CompletedJobStatusStore: Completed job store is inactive
          12/02/02 15:30:10 INFO mapred.JobInProgress: job_test_0001: nMaps=2 nReduces=0 max=-1
          12/02/02 15:30:10 INFO mapred.JobQueuesManager: Job job_test_0001 submitted to queue default
          12/02/02 15:30:10 INFO mapred.CapacityTaskScheduler: job_test_0001: Reserving tt2 since memory-requirements don't match
          *12/02/02 15:30:10 INFO jobtracker.TaskTracker: tt2: Reserved 3 MAP slots for job_test_0001*
          12/02/02 15:30:10 INFO mapred.TestCapacityScheduler: 1 running map tasks using 4 map slots. 4 additional slots reserved. 0 running reduce tasks using 0 reduce slots. 0 additional slots reserved.
          
          Harsh J added a comment -

          Patch with a possible fix and an extended test case.

          The other capacity scheduler tests pass under ant test:

              [junit] Running org.apache.hadoop.mapred.TestCapacityScheduler
              [junit] Tests run: 35, Failures: 0, Errors: 0, Time elapsed: 62.769 sec
              [junit] Running org.apache.hadoop.mapred.TestCapacitySchedulerConf
              [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 0.666 sec
              [junit] Running org.apache.hadoop.mapred.TestCapacitySchedulerServlet
              [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 11.041 sec
              [junit] Running org.apache.hadoop.mapred.TestCapacitySchedulerWithJobTracker
              [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 108.964 sec
              [junit] Running org.apache.hadoop.mapred.TestJobTrackerRestartWithCS
              [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 27.251 sec
          
          Harsh J added a comment -

          Please review.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12512938/MAPREDUCE-3789.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1746//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          Patch looks good, tests pass. One minor comment: the method

            private static int getTTMaxSlotsForType(TaskTrackerStatus status, TaskType type) {
              if (type == TaskType.MAP) {
                return status.getMaxMapSlots();
              }
              return status.getMaxReduceSlots();
            }
          

          could be refactored into

            private static int getTTMaxSlotsForType(TaskTrackerStatus status, TaskType type) {
              return (type == TaskType.MAP) ? status.getMaxMapSlots() : status.getMaxReduceSlots();
            }
          

          IMO it would be cleaner.

          Have you tested this in a cluster?

          Harsh J added a comment -

          Alejandro - Yes, I ran the reproduction steps on a live cluster as well. With the fix in place the low-slot-requirement job runs, while without it the slots are needlessly soaked up by the other high-memory job.

          Updated patch with your suggested refactoring attached; a small walk-through of the resulting check is sketched below.
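
          For illustration only, and not part of the attached patch: a tiny self-contained walk-through of the reproduction scenario with such a guard in place. The slot counts come from the description above; the class and method names here are hypothetical.

            public class ReservationScenarioSketch {

              // Same guard idea as the sketch under the description above.
              static boolean mayReserve(int ttMaxSlots, int slotsPerTask) {
                return slotsPerTask <= ttMaxSlots;
              }

              public static void main(String[] args) {
                int slotsPerTask = 4;    // ~7G map task over 2G map slots => 4 slots per task
                int tt1MaxMapSlots = 4;  // TT #1 can eventually hold the task, so reserving makes sense
                int tt2MaxMapSlots = 3;  // TT #2 can never hold the task

                System.out.println("reserve on tt1? " + mayReserve(tt1MaxMapSlots, slotsPerTask)); // true
                System.out.println("reserve on tt2? " + mayReserve(tt2MaxMapSlots, slotsPerTask)); // false: tt2's 3 slots stay free
              }
            }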

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12514086/MAPREDUCE-3789.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/1833//console

          This message is automatically generated.

          Alejandro Abdelnur added a comment -

          +1

          Harsh J added a comment -

          Thanks tucu, committed to branch-1.

          When I get time later, I will try the same on YARN and file a new JIRA if the bug still exists in its CapacityScheduler.

          Matt Foley added a comment -

          Closed upon release of Hadoop-1.1.0.


            People

            • Assignee:
              Harsh J
              Reporter:
              Harsh J
            • Votes:
              0
              Watchers:
              7
