  1. Hadoop YARN
  2. YARN-4690

Skip object allocation in FSAppAttempt#getResourceUsage when possible

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      YARN-2768 addresses an important bottleneck. Here is another similar instance where object allocation in Resources#subtract will slow down the fair scheduler's event processing thread.

      org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java)
          org.apache.hadoop.yarn.util.Records.newRecord(Records.java)
          org.apache.hadoop.yarn.util.resource.Resources.createResource(Resources.java)
          org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java)
          org.apache.hadoop.yarn.util.resource.Resources.subtract(Resources.java)
          org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getResourceUsage(FSAppAttempt.java)
          org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java)
          org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy$FairShareComparator.compare(FairSharePolicy.java)
          org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy$FairShareComparator.compare(FairSharePolicy.java)
          java.util.TimSort.binarySort(TimSort.java)
          java.util.TimSort.sort(TimSort.java)
          java.util.TimSort.sort(TimSort.java)
          java.util.Arrays.sort(Arrays.java)
          java.util.Collections.sort(Collections.java)
          org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java)
          org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java)
          org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java)
          org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java)
          org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java)
          org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.handle(ResourceSchedulerWrapper.java)
          org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.handle(ResourceSchedulerWrapper.java)
      

      One way to fix it is to return getCurrentConsumption() when there is no preemption, which is the normal case. This means the getResourceUsage method will return a reference to FSAppAttempt's internal resource object. That should be OK, as callers of getResourceUsage are not expected to modify the returned object.
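
      The idea can be sketched in isolation as follows. This is a minimal illustration, not the committed patch: SketchResource, SketchAppAttempt, and getResourceUsageAlwaysAllocating are hypothetical stand-ins invented here so the example compiles without Hadoop on the classpath, and the real FSAppAttempt and Resources classes differ in detail.

```java
// Illustrative stand-ins only; these are NOT the real Hadoop YARN classes.
final class SketchResource {
  final long memory;
  final int vcores;

  SketchResource(long memory, int vcores) {
    this.memory = memory;
    this.vcores = vcores;
  }

  boolean isZero() {
    return memory == 0 && vcores == 0;
  }

  // Like Resources.subtract in the stack trace, this allocates a new object.
  static SketchResource subtract(SketchResource a, SketchResource b) {
    return new SketchResource(a.memory - b.memory, a.vcores - b.vcores);
  }
}

final class SketchAppAttempt {
  private final SketchResource currentConsumption;
  private final SketchResource preempted;

  SketchAppAttempt(SketchResource currentConsumption, SketchResource preempted) {
    this.currentConsumption = currentConsumption;
    this.preempted = preempted;
  }

  SketchResource getCurrentConsumption() {
    return currentConsumption;
  }

  // Before: allocates on every call, even when nothing is preempted.
  SketchResource getResourceUsageAlwaysAllocating() {
    return SketchResource.subtract(currentConsumption, preempted);
  }

  // After: returns the internal object when nothing is preempted.
  // Callers must treat the result as read-only.
  SketchResource getResourceUsage() {
    return preempted.isZero()
        ? currentConsumption
        : SketchResource.subtract(currentConsumption, preempted);
  }
}

final class ResourceUsageSketch {
  public static void main(String[] args) {
    SketchAppAttempt idle = new SketchAppAttempt(
        new SketchResource(4096, 4), new SketchResource(0, 0));
    // Fast path: the same reference comes back, so nothing is allocated.
    System.out.println(idle.getResourceUsage() == idle.getCurrentConsumption());

    SketchAppAttempt preempted = new SketchAppAttempt(
        new SketchResource(4096, 4), new SketchResource(1024, 1));
    SketchResource usage = preempted.getResourceUsage();
    System.out.println(usage.memory + " MB, " + usage.vcores + " vcores");
  }
}
```

      Because the comparator in the stack trace calls getResourceUsage on every comparison during a sort, skipping the allocation on the common path removes one object creation per comparison. The trade-off is that the fast path exposes internal state, so the read-only expectation on callers matters.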

        Activity

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Closing the JIRA as part of 2.7.3 release.

        mingma Ming Ma added a comment -

        Thanks Sangjin Lee and Karthik Kambatla!

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #9319 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9319/)
        YARN-4690. Skip object allocation in FSAppAttempt#getResourceUsage when (sjlee: rev 7de70680fe44967e2afc92ba4c92f8e7afa7b151)

        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
        • hadoop-yarn-project/CHANGES.txt
        sjlee0 Sangjin Lee added a comment -

        Committed the patch to branch-2.6, branch-2.7, branch-2.8, branch-2, and trunk. Thanks Ming Ma for your contribution, and Karthik Kambatla for your review!

        sjlee0 Sangjin Lee added a comment -

        Sounds good. Hadn't checked what getPreemptedResources() does.

        I'm also +1. I'll commit it shortly.

        kasha Karthik Kambatla added a comment -

        The optimization might not help since the scope of the cached variable is limited to getResourceUsage(), which does nothing else. Also, getPreemptedResources() isn't heavy; it is just the accessor for preemptedResources.

        I am +1 on the patch posted here. Sangjin Lee - if you are fine with it, feel free to go ahead and commit it.

        sjlee0 Sangjin Lee added a comment -

        It looks good to me for the most part. I have only one minor comment.

        For the case where getPreemptedResources() is non-zero, we might want to cache that as a small optimization. In other words,

        Resource preemptedResources = getPreemptedResources();
        return preemptedResources.equals(Resources.none()) ?
            getCurrentConsumption() :
            Resources.subtract(getCurrentConsumption(), preemptedResources);
        
        mingma Ming Ma added a comment -

        The failed tests pass locally and aren't related.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 14s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        +1 mvninstall 6m 40s trunk passed
        +1 compile 0m 27s trunk passed with JDK v1.8.0_72
        +1 compile 0m 29s trunk passed with JDK v1.7.0_95
        +1 checkstyle 0m 17s trunk passed
        +1 mvnsite 0m 34s trunk passed
        +1 mvneclipse 0m 15s trunk passed
        +1 findbugs 1m 2s trunk passed
        +1 javadoc 0m 20s trunk passed with JDK v1.8.0_72
        +1 javadoc 0m 26s trunk passed with JDK v1.7.0_95
        +1 mvninstall 0m 30s the patch passed
        +1 compile 0m 23s the patch passed with JDK v1.8.0_72
        +1 javac 0m 23s the patch passed
        +1 compile 0m 27s the patch passed with JDK v1.7.0_95
        +1 javac 0m 27s the patch passed
        +1 checkstyle 0m 15s the patch passed
        +1 mvnsite 0m 32s the patch passed
        +1 mvneclipse 0m 12s the patch passed
        +1 whitespace 0m 0s Patch has no whitespace issues.
        +1 findbugs 1m 14s the patch passed
        +1 javadoc 0m 19s the patch passed with JDK v1.8.0_72
        +1 javadoc 0m 23s the patch passed with JDK v1.7.0_95
        -1 unit 70m 43s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_72.
        -1 unit 74m 39s hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95.
        +1 asflicense 0m 17s Patch does not generate ASF License warnings.
        161m 40s



        Reason Tests
        JDK v1.8.0_72 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.TestAMAuthorization
          hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
        JDK v1.7.0_95 Failed junit tests hadoop.yarn.server.resourcemanager.TestClientRMTokens
          hadoop.yarn.server.resourcemanager.TestAMAuthorization



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:0ca8df7
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12787689/YARN-4690.patch
        JIRA Issue YARN-4690
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux fe88dc389179 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / e6a7044
        Default Java 1.7.0_95
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_72 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
        findbugs v3.0.0
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10563/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_72.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/10563/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_95.txt
        unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/10563/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_72.txt https://builds.apache.org/job/PreCommit-YARN-Build/10563/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.7.0_95.txt
        JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/10563/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/10563/console
        Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        mingma Ming Ma added a comment -

        Here is the draft patch. After we fix YARN-4691, this jira becomes less important. But this is still a good fix to have.

        For the perf evaluation, the fair scheduler container allocation throughput is around 65k/min without the fix; around 133k/min with the fix.


          People

          • Assignee:
            mingma Ming Ma
            Reporter:
            mingma Ming Ma
          • Votes:
            0
            Watchers:
            7
