Hadoop Map/Reduce
  1. Hadoop Map/Reduce
  2. MAPREDUCE-1684

ClusterStatus can be cached in CapacityTaskScheduler.assignTasks()

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.2
    • Fix Version/s: 1.2.0
    • Component/s: capacity-sched
    • Labels:
      None
    • Target Version/s:

      Description

      Currently, CapacityTaskScheduler.assignTasks() calls getClusterStatus() thrice: once in assignTasks(), once in MapTaskScheduler and once in ReduceTaskScheduler. It can be cached in assignTasks() and re-used.

        Issue Links

          Activity

          Hide
          Kang Xiao added a comment -

          What's the reason to cache ClusterStatus, performance improvement?

          Show
          Kang Xiao added a comment - What's the reason to cache ClusterStatus, performance improvement?
          Hide
          Vinod Kumar Vavilapalli added a comment -

          What's the reason to cache ClusterStatus, performance improvement?

          Yes. Currently obtaining the ClusterStatus object needs the petrifying global JT lock.

          Show
          Vinod Kumar Vavilapalli added a comment - What's the reason to cache ClusterStatus, performance improvement? Yes. Currently obtaining the ClusterStatus object needs the petrifying global JT lock.
          Hide
          Koji Noguchi added a comment -

          Currently, CapacityTaskScheduler.assignTasks() calls getClusterStatus() thrice

          I think it calls getClusterStatus calls #jobs times in the worst case.

          For each heartbeat from TaskTracker with some slots available,

          heartbeat --> assignTasks 
                    --> addMap/ReduceTasks
                    --> TaskSchedulingMgr.assignTasks
                        --> For each queue : queuesForAssigningTasks)
                            --> getTaskFromQueue(queue)
                                --> For each j : queue.getRunningJobs()
                                    --> obtainNewTask --> **getClusterStatus**
          

          It can be cached in assignTasks() and re-used.

          Attaching a patch. Would this work?

          Motivation is, we see getClusterStatus way too often in our jstack holding the global lock.

          "IPC Server handler 15 on 50300" daemon prio=10 tid=0x000000005fc5d800 nid=0x6828 runnable [0x0000000044847000]
             java.lang.Thread.State: RUNNABLE
            at org.apache.hadoop.mapred.JobTracker.getClusterStatus(JobTracker.java:4065)
            - locked <0x00002aab6e638bd8> (a org.apache.hadoop.mapred.JobTracker)
            at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:503)
            at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:322)
            at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:419)
            at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:150)
            at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTasks(CapacityTaskScheduler.java:1075)
            at org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1044)
            - locked <0x00002aab6e7ffb10> (a org.apache.hadoop.mapred.CapacityTaskScheduler)
            at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:3398)
            - locked <0x00002aab6e638bd8> (a org.apache.hadoop.mapred.JobTracker)
            at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            at java.lang.reflect.Method.invoke(Method.java:597)
            at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
            at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
            at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:396)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
            at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
          
          Show
          Koji Noguchi added a comment - Currently, CapacityTaskScheduler.assignTasks() calls getClusterStatus() thrice I think it calls getClusterStatus calls #jobs times in the worst case. For each heartbeat from TaskTracker with some slots available, heartbeat --> assignTasks --> addMap/ReduceTasks --> TaskSchedulingMgr.assignTasks --> For each queue : queuesForAssigningTasks) --> getTaskFromQueue(queue) --> For each j : queue.getRunningJobs() --> obtainNewTask --> **getClusterStatus** It can be cached in assignTasks() and re-used. Attaching a patch. Would this work? Motivation is, we see getClusterStatus way too often in our jstack holding the global lock. "IPC Server handler 15 on 50300" daemon prio=10 tid=0x000000005fc5d800 nid=0x6828 runnable [0x0000000044847000] java.lang.Thread.State: RUNNABLE at org.apache.hadoop.mapred.JobTracker.getClusterStatus(JobTracker.java:4065) - locked <0x00002aab6e638bd8> (a org.apache.hadoop.mapred.JobTracker) at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:503) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:322) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:419) at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:150) at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTasks(CapacityTaskScheduler.java:1075) at org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1044) - locked <0x00002aab6e7ffb10> (a org.apache.hadoop.mapred.CapacityTaskScheduler) at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:3398) - locked <0x00002aab6e638bd8> (a org.apache.hadoop.mapred.JobTracker) at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
          Hide
          Devaraj Das added a comment -

          Looks good to me.

          Show
          Devaraj Das added a comment - Looks good to me.
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12541077/mapreduce-1684-v1.0.2-1.patch
          against trunk revision .

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2740//console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12541077/mapreduce-1684-v1.0.2-1.patch against trunk revision . -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2740//console This message is automatically generated.
          Hide
          Thomas Graves added a comment -

          +1 looks good to me.

          test-patch output. The find bugs exist without this patch.

          [exec] -1 overall.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] -1 tests included. The patch doesn't appear to include any new or modified tests.
          [exec] Please justify why no tests are needed for this patch.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] -1 findbugs. The patch appears to introduce 8 new Findbugs (version 1.3.9) warnings.
          [exec]

          Show
          Thomas Graves added a comment - +1 looks good to me. test-patch output. The find bugs exist without this patch. [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no tests are needed for this patch. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] -1 findbugs. The patch appears to introduce 8 new Findbugs (version 1.3.9) warnings. [exec]
          Hide
          Matt Foley added a comment -

          Closed upon release of Hadoop 1.2.0.

          Show
          Matt Foley added a comment - Closed upon release of Hadoop 1.2.0.

            People

            • Assignee:
              Koji Noguchi
              Reporter:
              Amareshwari Sriramadasu
            • Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development