Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-5774

MR Job stuck in ACCEPTED status without any progress in Fair Scheduler if set yarn.scheduler.minimum-allocation-mb to 0.

    Details

      Description

      MR Job stuck in ACCEPTED status without any progress in Fair Scheduler because there is no resource request for the AM. This happened when you configure yarn.scheduler.minimum-allocation-mb to zero.

      The problem is in the code used by both Capacity Scheduler and Fair Scheduler. scheduler.increment-allocation-mb is a concept in FS, but not CS. So the common code in class RMAppManager passes the yarn.scheduler.minimum-allocation-mb as incremental one because there is no incremental one for CS when it tried to normalize the resource requests.

           SchedulerUtils.normalizeRequest(amReq, scheduler.getResourceCalculator(),
                scheduler.getClusterResource(),
                scheduler.getMinimumResourceCapability(),
                scheduler.getMaximumResourceCapability(),
                scheduler.getMinimumResourceCapability());  --> incrementResource should be passed here.
      
      1. YARN-5774.001.patch
        16 kB
        Yufei Gu
      2. YARN-5774.002.patch
        17 kB
        Yufei Gu
      3. YARN-5774.003.patch
        17 kB
        Yufei Gu
      4. YARN-5774.004.patch
        23 kB
        Yufei Gu
      5. YARN-5774.005.patch
        34 kB
        Yufei Gu
      6. YARN-5774.006.patch
        36 kB
        Yufei Gu
      7. YARN-5774.007.patch
        38 kB
        Yufei Gu

        Activity

        Hide
        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 19s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
        +1 mvninstall 6m 44s trunk passed
        +1 compile 0m 32s trunk passed
        +1 checkstyle 0m 24s trunk passed
        +1 mvnsite 0m 38s trunk passed
        +1 mvneclipse 0m 16s trunk passed
        +1 findbugs 0m 57s trunk passed
        +1 javadoc 0m 21s trunk passed
        +1 mvninstall 0m 30s the patch passed
        +1 compile 0m 30s the patch passed
        +1 javac 0m 30s the patch passed
        -1 checkstyle 0m 21s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 295 unchanged - 9 fixed = 296 total (was 304)
        +1 mvnsite 0m 35s the patch passed
        +1 mvneclipse 0m 13s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 1m 2s the patch passed
        +1 javadoc 0m 18s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 925 unchanged - 13 fixed = 925 total (was 938)
        +1 unit 39m 8s hadoop-yarn-server-resourcemanager in the patch passed.
        +1 asflicense 0m 15s The patch does not generate ASF License warnings.
        53m 42s



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:9560f25
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12835082/YARN-5774.001.patch
        JIRA Issue YARN-5774
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 0d31235e371b 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / dbd2057
        Default Java 1.8.0_101
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13497/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13497/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/13497/console
        Powered by Apache Yetus 0.3.0 http://yetus.apache.org

        This message was automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 19s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. +1 mvninstall 6m 44s trunk passed +1 compile 0m 32s trunk passed +1 checkstyle 0m 24s trunk passed +1 mvnsite 0m 38s trunk passed +1 mvneclipse 0m 16s trunk passed +1 findbugs 0m 57s trunk passed +1 javadoc 0m 21s trunk passed +1 mvninstall 0m 30s the patch passed +1 compile 0m 30s the patch passed +1 javac 0m 30s the patch passed -1 checkstyle 0m 21s hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 295 unchanged - 9 fixed = 296 total (was 304) +1 mvnsite 0m 35s the patch passed +1 mvneclipse 0m 13s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 1m 2s the patch passed +1 javadoc 0m 18s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 925 unchanged - 13 fixed = 925 total (was 938) +1 unit 39m 8s hadoop-yarn-server-resourcemanager in the patch passed. +1 asflicense 0m 15s The patch does not generate ASF License warnings. 53m 42s Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12835082/YARN-5774.001.patch JIRA Issue YARN-5774 Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 0d31235e371b 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / dbd2057 Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13497/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13497/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager Console output https://builds.apache.org/job/PreCommit-YARN-Build/13497/console Powered by Apache Yetus 0.3.0 http://yetus.apache.org This message was automatically generated.
        Hide
        miklos.szegedi@cloudera.com Miklos Szegedi added a comment -

        Hi Yufei Gu!

        I am wondering, if the normalize methods for DefaultResourceCalculator and DominantResourceCalculator are better place for this check. They cover all other possible callers. Also, it might make sense to throw, if min > max.

            if (Resources.equals(incrementResource, Resources.none())) {
              throw new RuntimeException("Increment resource cannot be zero!");
            }
        
        Show
        miklos.szegedi@cloudera.com Miklos Szegedi added a comment - Hi Yufei Gu ! I am wondering, if the normalize methods for DefaultResourceCalculator and DominantResourceCalculator are better place for this check. They cover all other possible callers. Also, it might make sense to throw, if min > max. if (Resources.equals(incrementResource, Resources.none())) { throw new RuntimeException( "Increment resource cannot be zero!" ); }
        Hide
        yufeigu Yufei Gu added a comment -

        Miklos Szegedi, thanks for the review. I uploaded patch 002 for your comments. For the min > max thing, each scheduler will do the sanity check while initialize itself. You can find it in function initScheduler -> validateConf.

        Show
        yufeigu Yufei Gu added a comment - Miklos Szegedi , thanks for the review. I uploaded patch 002 for your comments. For the min > max thing, each scheduler will do the sanity check while initialize itself. You can find it in function initScheduler -> validateConf .
        Hide
        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 16s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        0 mvndep 0m 10s Maven dependency ordering for branch
        +1 mvninstall 6m 43s trunk passed
        +1 compile 2m 21s trunk passed
        +1 checkstyle 0m 41s trunk passed
        +1 mvnsite 1m 7s trunk passed
        +1 mvneclipse 0m 30s trunk passed
        +1 findbugs 1m 51s trunk passed
        +1 javadoc 0m 49s trunk passed
        0 mvndep 0m 10s Maven dependency ordering for patch
        +1 mvninstall 0m 57s the patch passed
        +1 compile 2m 18s the patch passed
        +1 javac 2m 18s the patch passed
        +1 checkstyle 0m 41s hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 299 unchanged - 9 fixed = 299 total (was 308)
        +1 mvnsite 1m 5s the patch passed
        +1 mvneclipse 0m 28s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 2m 5s the patch passed
        +1 javadoc 0m 27s hadoop-yarn-common in the patch passed.
        +1 javadoc 0m 21s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 927 unchanged - 11 fixed = 927 total (was 938)
        +1 unit 2m 17s hadoop-yarn-common in the patch passed.
        +1 unit 38m 34s hadoop-yarn-server-resourcemanager in the patch passed.
        +1 asflicense 0m 19s The patch does not generate ASF License warnings.
        71m 52s



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:9560f25
        JIRA Issue YARN-5774
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12836235/YARN-5774.002.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux f9073096be1a 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 773c60b
        Default Java 1.8.0_101
        findbugs v3.0.0
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13708/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/13708/console
        Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 16s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 0 mvndep 0m 10s Maven dependency ordering for branch +1 mvninstall 6m 43s trunk passed +1 compile 2m 21s trunk passed +1 checkstyle 0m 41s trunk passed +1 mvnsite 1m 7s trunk passed +1 mvneclipse 0m 30s trunk passed +1 findbugs 1m 51s trunk passed +1 javadoc 0m 49s trunk passed 0 mvndep 0m 10s Maven dependency ordering for patch +1 mvninstall 0m 57s the patch passed +1 compile 2m 18s the patch passed +1 javac 2m 18s the patch passed +1 checkstyle 0m 41s hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 299 unchanged - 9 fixed = 299 total (was 308) +1 mvnsite 1m 5s the patch passed +1 mvneclipse 0m 28s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 2m 5s the patch passed +1 javadoc 0m 27s hadoop-yarn-common in the patch passed. +1 javadoc 0m 21s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 927 unchanged - 11 fixed = 927 total (was 938) +1 unit 2m 17s hadoop-yarn-common in the patch passed. +1 unit 38m 34s hadoop-yarn-server-resourcemanager in the patch passed. +1 asflicense 0m 19s The patch does not generate ASF License warnings. 71m 52s Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Issue YARN-5774 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12836235/YARN-5774.002.patch Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux f9073096be1a 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 773c60b Default Java 1.8.0_101 findbugs v3.0.0 Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13708/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn Console output https://builds.apache.org/job/PreCommit-YARN-Build/13708/console Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
        Hide
        miklos.szegedi@cloudera.com Miklos Szegedi added a comment -

        Thank you, Yufei Gu!

        I am wondering, if we really need to compare stepFactor to Resources.none() here rather than stepFactor.getMemorySize() to 0. It is an edge case but what if stepFactor.getGetVirtualCores() is non-zero but stepFactor.getMemorySize() is 0? We avoid throwing an exception in this case.

          public Resource normalize(Resource r, Resource minimumResource,
              Resource maximumResource, Resource stepFactor) {
            if (Resources.equals(stepFactor, Resources.none())) {
              throw new YarnRuntimeException("StepFactor resource cannot be zero!");
            }
        
            long normalizedMemory = Math.min(
                roundUp(
                    Math.max(r.getMemorySize(), minimumResource.getMemorySize()),
                    stepFactor.getMemorySize()),
                    maximumResource.getMemorySize());
            return Resources.createResource(normalizedMemory);
          }
        
        Show
        miklos.szegedi@cloudera.com Miklos Szegedi added a comment - Thank you, Yufei Gu ! I am wondering, if we really need to compare stepFactor to Resources.none() here rather than stepFactor.getMemorySize() to 0. It is an edge case but what if stepFactor.getGetVirtualCores() is non-zero but stepFactor.getMemorySize() is 0? We avoid throwing an exception in this case. public Resource normalize(Resource r, Resource minimumResource, Resource maximumResource, Resource stepFactor) { if (Resources.equals(stepFactor, Resources.none())) { throw new YarnRuntimeException( "StepFactor resource cannot be zero!" ); } long normalizedMemory = Math .min( roundUp( Math .max(r.getMemorySize(), minimumResource.getMemorySize()), stepFactor.getMemorySize()), maximumResource.getMemorySize()); return Resources.createResource(normalizedMemory); }
        Hide
        yufeigu Yufei Gu added a comment -

        Nice catch! Thanks Miklos Szegedi. I uploaded patch 003 for your comment.

        Show
        yufeigu Yufei Gu added a comment - Nice catch! Thanks Miklos Szegedi . I uploaded patch 003 for your comment.
        Hide
        miklos.szegedi@cloudera.com Miklos Szegedi added a comment -

        +1 (non-binding) The change looks good to me. Thank you, Yufei Gu!

        Show
        miklos.szegedi@cloudera.com Miklos Szegedi added a comment - +1 (non-binding) The change looks good to me. Thank you, Yufei Gu !
        Hide
        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 14s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
        0 mvndep 0m 9s Maven dependency ordering for branch
        +1 mvninstall 6m 54s trunk passed
        +1 compile 2m 17s trunk passed
        +1 checkstyle 0m 42s trunk passed
        +1 mvnsite 1m 7s trunk passed
        +1 mvneclipse 0m 29s trunk passed
        +1 findbugs 1m 51s trunk passed
        +1 javadoc 0m 48s trunk passed
        0 mvndep 0m 10s Maven dependency ordering for patch
        +1 mvninstall 0m 56s the patch passed
        +1 compile 2m 14s the patch passed
        +1 javac 2m 14s the patch passed
        +1 checkstyle 0m 40s hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 299 unchanged - 9 fixed = 299 total (was 308)
        +1 mvnsite 1m 4s the patch passed
        +1 mvneclipse 0m 28s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 2m 5s the patch passed
        +1 javadoc 0m 25s hadoop-yarn-common in the patch passed.
        +1 javadoc 0m 20s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 926 unchanged - 11 fixed = 926 total (was 937)
        +1 unit 2m 17s hadoop-yarn-common in the patch passed.
        -1 unit 35m 3s hadoop-yarn-server-resourcemanager in the patch failed.
        +1 asflicense 0m 18s The patch does not generate ASF License warnings.
        68m 17s



        Reason Tests
        Failed junit tests hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:9560f25
        JIRA Issue YARN-5774
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12836466/YARN-5774.003.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 36e3974b0136 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / aacf214
        Default Java 1.8.0_101
        findbugs v3.0.0
        unit https://builds.apache.org/job/PreCommit-YARN-Build/13743/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13743/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/13743/console
        Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 14s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 0 mvndep 0m 9s Maven dependency ordering for branch +1 mvninstall 6m 54s trunk passed +1 compile 2m 17s trunk passed +1 checkstyle 0m 42s trunk passed +1 mvnsite 1m 7s trunk passed +1 mvneclipse 0m 29s trunk passed +1 findbugs 1m 51s trunk passed +1 javadoc 0m 48s trunk passed 0 mvndep 0m 10s Maven dependency ordering for patch +1 mvninstall 0m 56s the patch passed +1 compile 2m 14s the patch passed +1 javac 2m 14s the patch passed +1 checkstyle 0m 40s hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 299 unchanged - 9 fixed = 299 total (was 308) +1 mvnsite 1m 4s the patch passed +1 mvneclipse 0m 28s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 2m 5s the patch passed +1 javadoc 0m 25s hadoop-yarn-common in the patch passed. +1 javadoc 0m 20s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 926 unchanged - 11 fixed = 926 total (was 937) +1 unit 2m 17s hadoop-yarn-common in the patch passed. -1 unit 35m 3s hadoop-yarn-server-resourcemanager in the patch failed. +1 asflicense 0m 18s The patch does not generate ASF License warnings. 68m 17s Reason Tests Failed junit tests hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Issue YARN-5774 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12836466/YARN-5774.003.patch Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 36e3974b0136 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / aacf214 Default Java 1.8.0_101 findbugs v3.0.0 unit https://builds.apache.org/job/PreCommit-YARN-Build/13743/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13743/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn Console output https://builds.apache.org/job/PreCommit-YARN-Build/13743/console Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
        Hide
        templedf Daniel Templeton added a comment -

        Thanks, Yufei Gu. In addition to Miklos Szegedi's comments, I have a couple of minor points:

        • AbstractYarnScheduler.normailzeRequest(List<...> ask...) should be normalizeRequests() to avoid confusion.
        • While you're in there, you may as well correct the typo (hte/the) in the javadoc for ResourceCalculator.normalize()
        • To add onto Miklos Szegedi's comments, ResourceCalculator.normalize() should check memory and CPU independently. Also, I think you can leave out the 0 check in SchedulerUtils.normalizeRequest() since it's redundant.
        • Is throwing an exception the right thing to do if the min allocation is 0? Looks to me like that exception my be pretty hard to diagnose.
        Show
        templedf Daniel Templeton added a comment - Thanks, Yufei Gu . In addition to Miklos Szegedi 's comments, I have a couple of minor points: AbstractYarnScheduler.normailzeRequest(List<...> ask...) should be normalizeRequests() to avoid confusion. While you're in there, you may as well correct the typo (hte/the) in the javadoc for ResourceCalculator.normalize() To add onto Miklos Szegedi 's comments, ResourceCalculator.normalize() should check memory and CPU independently. Also, I think you can leave out the 0 check in SchedulerUtils.normalizeRequest() since it's redundant. Is throwing an exception the right thing to do if the min allocation is 0? Looks to me like that exception my be pretty hard to diagnose.
        Hide
        yufeigu Yufei Gu added a comment -

        Thanks Daniel's review. I've uploaded patch 004 for your comments.
        For the last comments, I add increment resource sanity check into initScheduler -> validateConf. So if users misconfigure the increment resource in fair scheduler, a detailed error message will show up. The only reason to hit the exception in resource calculator is that programmer of YARN misuse the resource calculator.

        Show
        yufeigu Yufei Gu added a comment - Thanks Daniel's review. I've uploaded patch 004 for your comments. For the last comments, I add increment resource sanity check into initScheduler -> validateConf . So if users misconfigure the increment resource in fair scheduler, a detailed error message will show up. The only reason to hit the exception in resource calculator is that programmer of YARN misuse the resource calculator.
        Hide
        hadoopqa Hadoop QA added a comment -
        +1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 17s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
        0 mvndep 0m 56s Maven dependency ordering for branch
        +1 mvninstall 6m 52s trunk passed
        +1 compile 2m 17s trunk passed
        +1 checkstyle 0m 42s trunk passed
        +1 mvnsite 1m 9s trunk passed
        +1 mvneclipse 0m 30s trunk passed
        +1 findbugs 1m 50s trunk passed
        +1 javadoc 0m 50s trunk passed
        0 mvndep 0m 10s Maven dependency ordering for patch
        +1 mvninstall 0m 58s the patch passed
        +1 compile 2m 16s the patch passed
        +1 javac 2m 16s the patch passed
        -0 checkstyle 0m 40s hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 309 unchanged - 11 fixed = 310 total (was 320)
        +1 mvnsite 1m 6s the patch passed
        +1 mvneclipse 0m 29s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 2m 8s the patch passed
        +1 javadoc 0m 28s hadoop-yarn-common in the patch passed.
        +1 javadoc 0m 21s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 924 unchanged - 13 fixed = 924 total (was 937)
        +1 unit 2m 18s hadoop-yarn-common in the patch passed.
        +1 unit 38m 54s hadoop-yarn-server-resourcemanager in the patch passed.
        +1 asflicense 0m 19s The patch does not generate ASF License warnings.
        73m 14s



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:9560f25
        JIRA Issue YARN-5774
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12837214/YARN-5774.004.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux cd82a985cedb 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / abfc15d
        Default Java 1.8.0_101
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13790/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13790/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/13790/console
        Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 reexec 0m 17s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 1 new or modified test files. 0 mvndep 0m 56s Maven dependency ordering for branch +1 mvninstall 6m 52s trunk passed +1 compile 2m 17s trunk passed +1 checkstyle 0m 42s trunk passed +1 mvnsite 1m 9s trunk passed +1 mvneclipse 0m 30s trunk passed +1 findbugs 1m 50s trunk passed +1 javadoc 0m 50s trunk passed 0 mvndep 0m 10s Maven dependency ordering for patch +1 mvninstall 0m 58s the patch passed +1 compile 2m 16s the patch passed +1 javac 2m 16s the patch passed -0 checkstyle 0m 40s hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 309 unchanged - 11 fixed = 310 total (was 320) +1 mvnsite 1m 6s the patch passed +1 mvneclipse 0m 29s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 2m 8s the patch passed +1 javadoc 0m 28s hadoop-yarn-common in the patch passed. +1 javadoc 0m 21s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 924 unchanged - 13 fixed = 924 total (was 937) +1 unit 2m 18s hadoop-yarn-common in the patch passed. +1 unit 38m 54s hadoop-yarn-server-resourcemanager in the patch passed. +1 asflicense 0m 19s The patch does not generate ASF License warnings. 73m 14s Subsystem Report/Notes Docker Image:yetus/hadoop:9560f25 JIRA Issue YARN-5774 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12837214/YARN-5774.004.patch Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux cd82a985cedb 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / abfc15d Default Java 1.8.0_101 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13790/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13790/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn Console output https://builds.apache.org/job/PreCommit-YARN-Build/13790/console Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
        Hide
        templedf Daniel Templeton added a comment -

        So if users misconfigure the increment resource in fair scheduler, a detailed error message will show up.

        That's great for fair scheduler, but what about CS and FIFO? I'm particularly worried about those because an increment of 0 was not previously treated as invalid. What's the intent of throwing the exception in normalize()? You throw an exception when you want to halt the execution flow and allow some out-of-sequence remedial code to run. In this case, no one is expecting to see this exception, so no one will catch it, and there's no action that needs to be taken in the CS and FIFO cases. The net result is that things will fail in indirect ways. Since it's not a failure for CS and FIFO, I don't think you should throw the exception. In the case of FS, as you point out, the only reason to hit the exception is misuse of the resource calculator.

        Show
        templedf Daniel Templeton added a comment - So if users misconfigure the increment resource in fair scheduler, a detailed error message will show up. That's great for fair scheduler, but what about CS and FIFO? I'm particularly worried about those because an increment of 0 was not previously treated as invalid. What's the intent of throwing the exception in normalize() ? You throw an exception when you want to halt the execution flow and allow some out-of-sequence remedial code to run. In this case, no one is expecting to see this exception, so no one will catch it, and there's no action that needs to be taken in the CS and FIFO cases. The net result is that things will fail in indirect ways. Since it's not a failure for CS and FIFO, I don't think you should throw the exception. In the case of FS, as you point out, the only reason to hit the exception is misuse of the resource calculator.
        Hide
        yufeigu Yufei Gu added a comment -

        Thanks Daniel Templeton for the review.
        In CS, minimum share will serve as an increment share and it cannot be zero, which is guaranteed by the CS sanity check. FIFO doesn't have sanity check. I guess it is because of nobody cares about it. We can definitely add sanity check for FIFO scheduler in this JIRA or followup JIRA. So that's fine for CS, FIFO and FS.
        The real tricky part is in common parts of scheduler(or RM). People who write the code in common parts might not even notice there is an increment share config because CS and FIFO don't have it and FS has it. That is how the issue happens in the very beginning.
        This patch let normalize() throw a runtime exception if increment is 0, which no need to catch and handle. It will fail the RM when it happens. The main reason is that we should consider 0 increment as an invalid configuration according to the offline discussion with Karthik Kambatla.

        Show
        yufeigu Yufei Gu added a comment - Thanks Daniel Templeton for the review. In CS, minimum share will serve as an increment share and it cannot be zero, which is guaranteed by the CS sanity check. FIFO doesn't have sanity check. I guess it is because of nobody cares about it. We can definitely add sanity check for FIFO scheduler in this JIRA or followup JIRA. So that's fine for CS, FIFO and FS. The real tricky part is in common parts of scheduler(or RM). People who write the code in common parts might not even notice there is an increment share config because CS and FIFO don't have it and FS has it. That is how the issue happens in the very beginning. This patch let normalize() throw a runtime exception if increment is 0, which no need to catch and handle. It will fail the RM when it happens. The main reason is that we should consider 0 increment as an invalid configuration according to the offline discussion with Karthik Kambatla .
        Hide
        yufeigu Yufei Gu added a comment -

        File YARN-5880 to add sanity check to FIFO scheduler.

        Show
        yufeigu Yufei Gu added a comment - File YARN-5880 to add sanity check to FIFO scheduler.
        Hide
        templedf Daniel Templeton added a comment -

        I'm still not in love with that exception. Maybe log it as an error and assume the minimum as the increment instead?

        Couple of additional comments:

        Let's make your messages a little clearer. How about:

        "StepFactor memory size cannot be zero!" -> "Memory cannot be allocated in increments of zero. Assuming " + minimumResource.getMemorySize() + "MB increment size. Please ensure the scheduler configuration is correct."

        In DominantResourceCalculator, I think you'll need to test for memory and vcores separately and then use the same message as above.

        I'd also love to see some tests that validate the changes you made.

        It would be good to have javadoc for AbstractYarnScheduler. normalizeRequests().

        There was something else, but I can't think of it now...

        Show
        templedf Daniel Templeton added a comment - I'm still not in love with that exception. Maybe log it as an error and assume the minimum as the increment instead? Couple of additional comments: Let's make your messages a little clearer. How about: "StepFactor memory size cannot be zero!" -> "Memory cannot be allocated in increments of zero. Assuming " + minimumResource.getMemorySize() + "MB increment size. Please ensure the scheduler configuration is correct." In DominantResourceCalculator , I think you'll need to test for memory and vcores separately and then use the same message as above. I'd also love to see some tests that validate the changes you made. It would be good to have javadoc for AbstractYarnScheduler. normalizeRequests() . There was something else, but I can't think of it now...
        Hide
        yufeigu Yufei Gu added a comment -

        Thanks Daniel Templeton for the review. I've uploaded the new patch for your comments.

        Show
        yufeigu Yufei Gu added a comment - Thanks Daniel Templeton for the review. I've uploaded the new patch for your comments.
        Hide
        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 18s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
        0 mvndep 0m 9s Maven dependency ordering for branch
        +1 mvninstall 6m 47s trunk passed
        +1 compile 4m 58s trunk passed
        +1 checkstyle 0m 48s trunk passed
        +1 mvnsite 1m 52s trunk passed
        +1 mvneclipse 0m 59s trunk passed
        +1 findbugs 3m 13s trunk passed
        +1 javadoc 1m 24s trunk passed
        0 mvndep 0m 11s Maven dependency ordering for patch
        +1 mvninstall 1m 20s the patch passed
        +1 compile 4m 36s the patch passed
        +1 javac 4m 36s the patch passed
        -0 checkstyle 0m 48s hadoop-yarn-project/hadoop-yarn: The patch generated 5 new + 347 unchanged - 11 fixed = 352 total (was 358)
        +1 mvnsite 1m 48s the patch passed
        +1 mvneclipse 0m 56s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 3m 32s the patch passed
        +1 javadoc 0m 22s hadoop-yarn-api in the patch passed.
        +1 javadoc 0m 33s hadoop-yarn-common in the patch passed.
        +1 javadoc 0m 26s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 922 unchanged - 13 fixed = 922 total (was 935)
        +1 unit 0m 31s hadoop-yarn-api in the patch passed.
        +1 unit 2m 23s hadoop-yarn-common in the patch passed.
        -1 unit 42m 44s hadoop-yarn-server-resourcemanager in the patch failed.
        -1 asflicense 0m 29s The patch generated 1 ASF License warnings.
        89m 32s



        Reason Tests
        Failed junit tests hadoop.yarn.server.resourcemanager.TestRMRestart



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:a9ad5d6
        JIRA Issue YARN-5774
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12839893/YARN-5774.005.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux e5cdfaa44e74 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / 683e0c7
        Default Java 1.8.0_111
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/14003/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
        unit https://builds.apache.org/job/PreCommit-YARN-Build/14003/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/14003/testReport/
        asflicense https://builds.apache.org/job/PreCommit-YARN-Build/14003/artifact/patchprocess/patch-asflicense-problems.txt
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/14003/console
        Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 18s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 2 new or modified test files. 0 mvndep 0m 9s Maven dependency ordering for branch +1 mvninstall 6m 47s trunk passed +1 compile 4m 58s trunk passed +1 checkstyle 0m 48s trunk passed +1 mvnsite 1m 52s trunk passed +1 mvneclipse 0m 59s trunk passed +1 findbugs 3m 13s trunk passed +1 javadoc 1m 24s trunk passed 0 mvndep 0m 11s Maven dependency ordering for patch +1 mvninstall 1m 20s the patch passed +1 compile 4m 36s the patch passed +1 javac 4m 36s the patch passed -0 checkstyle 0m 48s hadoop-yarn-project/hadoop-yarn: The patch generated 5 new + 347 unchanged - 11 fixed = 352 total (was 358) +1 mvnsite 1m 48s the patch passed +1 mvneclipse 0m 56s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 3m 32s the patch passed +1 javadoc 0m 22s hadoop-yarn-api in the patch passed. +1 javadoc 0m 33s hadoop-yarn-common in the patch passed. +1 javadoc 0m 26s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 922 unchanged - 13 fixed = 922 total (was 935) +1 unit 0m 31s hadoop-yarn-api in the patch passed. +1 unit 2m 23s hadoop-yarn-common in the patch passed. -1 unit 42m 44s hadoop-yarn-server-resourcemanager in the patch failed. -1 asflicense 0m 29s The patch generated 1 ASF License warnings. 89m 32s Reason Tests Failed junit tests hadoop.yarn.server.resourcemanager.TestRMRestart Subsystem Report/Notes Docker Image:yetus/hadoop:a9ad5d6 JIRA Issue YARN-5774 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12839893/YARN-5774.005.patch Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux e5cdfaa44e74 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / 683e0c7 Default Java 1.8.0_111 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/14003/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt unit https://builds.apache.org/job/PreCommit-YARN-Build/14003/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/14003/testReport/ asflicense https://builds.apache.org/job/PreCommit-YARN-Build/14003/artifact/patchprocess/patch-asflicense-problems.txt modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn Console output https://builds.apache.org/job/PreCommit-YARN-Build/14003/console Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
        Hide
        yufeigu Yufei Gu added a comment -

        Uploaded patch 006 for the style issues and the test failure is unrelated.

        Show
        yufeigu Yufei Gu added a comment - Uploaded patch 006 for the style issues and the test failure is unrelated.
        Hide
        templedf Daniel Templeton added a comment -

        Thanks for making those changes. A few more comments:

        • The patch needs a rebase.
        • In DominantResourceCalculator.normalize(), would it make more sense to only replace the zero part of the step size with that part of the minimum, e.g. only the CPU if the memory is fine?
        • I kinda want to make NormalizableRequest an abstract base class instead of an interface. The fact that the requests are asking for resources is core to what they do. I would also like to find a better name, because the fact that the things can be normalized is incidental. The main point is that they request resources. Unfortunately ResourceRequest is already taken. Maybe ContainerRequest? Maybe rename ResourceRequest to ContainerRequest and reuse ResourceRequest? Karthik Kambatla, any opinion here?
        • NormalizableRequest could use a little more explanation in the class javadoc and some method javadocs.
        • ResourceRequest and UpdateContainerRequest should have getCapacity() and setCapacity() set as Override
        • In TestResourceCalculator.testNormalize(), it would be nice to test normalization to a value that is more than the minimum, e.g. min=2, max=8, ask=3.
        Show
        templedf Daniel Templeton added a comment - Thanks for making those changes. A few more comments: The patch needs a rebase. In DominantResourceCalculator.normalize() , would it make more sense to only replace the zero part of the step size with that part of the minimum, e.g. only the CPU if the memory is fine? I kinda want to make NormalizableRequest an abstract base class instead of an interface. The fact that the requests are asking for resources is core to what they do. I would also like to find a better name, because the fact that the things can be normalized is incidental. The main point is that they request resources. Unfortunately ResourceRequest is already taken. Maybe ContainerRequest ? Maybe rename ResourceRequest to ContainerRequest and reuse ResourceRequest ? Karthik Kambatla , any opinion here? NormalizableRequest could use a little more explanation in the class javadoc and some method javadocs. ResourceRequest and UpdateContainerRequest should have getCapacity() and setCapacity() set as Override In TestResourceCalculator.testNormalize() , it would be nice to test normalization to a value that is more than the minimum, e.g. min=2, max=8, ask=3.
        Hide
        templedf Daniel Templeton added a comment -

        Yufei Gu, can you take a look at YARN-5795 and see 1) what you think of it, and 2) if that's an issue you can also tackle in this patch? Thanks!

        Show
        templedf Daniel Templeton added a comment - Yufei Gu , can you take a look at YARN-5795 and see 1) what you think of it, and 2) if that's an issue you can also tackle in this patch? Thanks!
        Hide
        yufeigu Yufei Gu added a comment -

        Daniel Templeton, sure, and thanks for the review.

        Show
        yufeigu Yufei Gu added a comment - Daniel Templeton , sure, and thanks for the review.
        Hide
        yufeigu Yufei Gu added a comment -

        Daniel Templeton, I've uploaded the patch 007 for all your comments except the last one since the first case in testNormalize covers that.
        For YARN-5795, this patch will solve the issue incidentally but we can do more in that JIRA as I commented in it.

        Show
        yufeigu Yufei Gu added a comment - Daniel Templeton , I've uploaded the patch 007 for all your comments except the last one since the first case in testNormalize covers that. For YARN-5795 , this patch will solve the issue incidentally but we can do more in that JIRA as I commented in it.
        Hide
        hadoopqa Hadoop QA added a comment -
        +1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 14s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
        0 mvndep 1m 1s Maven dependency ordering for branch
        +1 mvninstall 7m 14s trunk passed
        +1 compile 5m 8s trunk passed
        +1 checkstyle 0m 48s trunk passed
        +1 mvnsite 1m 55s trunk passed
        +1 mvneclipse 0m 57s trunk passed
        +1 findbugs 3m 19s trunk passed
        +1 javadoc 1m 23s trunk passed
        0 mvndep 0m 10s Maven dependency ordering for patch
        +1 mvninstall 1m 24s the patch passed
        +1 compile 4m 47s the patch passed
        +1 javac 4m 47s the patch passed
        -0 checkstyle 0m 47s hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 344 unchanged - 12 fixed = 345 total (was 356)
        +1 mvnsite 1m 46s the patch passed
        +1 mvneclipse 0m 55s the patch passed
        +1 whitespace 0m 0s The patch has no whitespace issues.
        +1 findbugs 3m 36s the patch passed
        +1 javadoc 0m 22s hadoop-yarn-api in the patch passed.
        +1 javadoc 0m 31s hadoop-yarn-common in the patch passed.
        +1 javadoc 0m 26s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 922 unchanged - 13 fixed = 922 total (was 935)
        +1 unit 0m 30s hadoop-yarn-api in the patch passed.
        +1 unit 2m 23s hadoop-yarn-common in the patch passed.
        +1 unit 39m 18s hadoop-yarn-server-resourcemanager in the patch passed.
        +1 asflicense 0m 29s The patch does not generate ASF License warnings.
        87m 42s



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:a9ad5d6
        JIRA Issue YARN-5774
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12840362/YARN-5774.007.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 8e764120d7ad 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / c7a5f29
        Default Java 1.8.0_111
        findbugs v3.0.0
        checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/14070/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/14070/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/14070/console
        Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - +1 overall Vote Subsystem Runtime Comment 0 reexec 0m 14s Docker mode activated. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 2 new or modified test files. 0 mvndep 1m 1s Maven dependency ordering for branch +1 mvninstall 7m 14s trunk passed +1 compile 5m 8s trunk passed +1 checkstyle 0m 48s trunk passed +1 mvnsite 1m 55s trunk passed +1 mvneclipse 0m 57s trunk passed +1 findbugs 3m 19s trunk passed +1 javadoc 1m 23s trunk passed 0 mvndep 0m 10s Maven dependency ordering for patch +1 mvninstall 1m 24s the patch passed +1 compile 4m 47s the patch passed +1 javac 4m 47s the patch passed -0 checkstyle 0m 47s hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 344 unchanged - 12 fixed = 345 total (was 356) +1 mvnsite 1m 46s the patch passed +1 mvneclipse 0m 55s the patch passed +1 whitespace 0m 0s The patch has no whitespace issues. +1 findbugs 3m 36s the patch passed +1 javadoc 0m 22s hadoop-yarn-api in the patch passed. +1 javadoc 0m 31s hadoop-yarn-common in the patch passed. +1 javadoc 0m 26s hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 922 unchanged - 13 fixed = 922 total (was 935) +1 unit 0m 30s hadoop-yarn-api in the patch passed. +1 unit 2m 23s hadoop-yarn-common in the patch passed. +1 unit 39m 18s hadoop-yarn-server-resourcemanager in the patch passed. +1 asflicense 0m 29s The patch does not generate ASF License warnings. 87m 42s Subsystem Report/Notes Docker Image:yetus/hadoop:a9ad5d6 JIRA Issue YARN-5774 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12840362/YARN-5774.007.patch Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle uname Linux 8e764120d7ad 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux Build tool maven Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh git revision trunk / c7a5f29 Default Java 1.8.0_111 findbugs v3.0.0 checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/14070/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt Test Results https://builds.apache.org/job/PreCommit-YARN-Build/14070/testReport/ modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn Console output https://builds.apache.org/job/PreCommit-YARN-Build/14070/console Powered by Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org This message was automatically generated.
        Hide
        templedf Daniel Templeton added a comment -

        Thanks for the update, Yufei Gu! I only see one tiny nit: in the AbstractResourceRequest javadocs,

        {@code ResourceManager}

        should be

        {@link ResourceManager}
        Show
        templedf Daniel Templeton added a comment - Thanks for the update, Yufei Gu ! I only see one tiny nit: in the AbstractResourceRequest javadocs, {@code ResourceManager} should be {@link ResourceManager}
        Hide
        yufeigu Yufei Gu added a comment -

        Thanks Daniel Templeton. The reason to use code instead of link is to avoid cyclic dependency. AbstractResourceRequest is in project hadoop-yarn-api, and ResourceManager is in project hadoop-yarn-server which depends on project hadoop-yarn-api. If we use link, we need to let project hadoop-yarn-api depends on project hadoop-yarn-server and introduce cyclic dependency.

        Show
        yufeigu Yufei Gu added a comment - Thanks Daniel Templeton . The reason to use code instead of link is to avoid cyclic dependency. AbstractResourceRequest is in project hadoop-yarn-api, and ResourceManager is in project hadoop-yarn-server which depends on project hadoop-yarn-api. If we use link, we need to let project hadoop-yarn-api depends on project hadoop-yarn-server and introduce cyclic dependency.
        Hide
        templedf Daniel Templeton added a comment -

        Good point. +1 then.

        Show
        templedf Daniel Templeton added a comment - Good point. +1 then.
        Hide
        templedf Daniel Templeton added a comment -

        Thanks for the patch, Yufei Gu! Committed to trunk and branch-2.

        Show
        templedf Daniel Templeton added a comment - Thanks for the patch, Yufei Gu ! Committed to trunk and branch-2.
        Hide
        yufeigu Yufei Gu added a comment -

        Thanks a lot for the review and commit, Daniel Templeton!

        Show
        yufeigu Yufei Gu added a comment - Thanks a lot for the review and commit, Daniel Templeton !
        Hide
        hudson Hudson added a comment -

        SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10906 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10906/)
        YARN-5774. MR Job stuck in ACCEPTED status without any progress in Fair (templedf: rev 25f9872be63423ada6a18481eaad2888e731fdac)

        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/DefaultResourceCalculator.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/DominantResourceCalculator.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
        • (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/AbstractResourceRequest.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceCalculator.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ResourceRequest.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/resource/TestResourceCalculator.java
        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/UpdateContainerRequest.java
        Show
        hudson Hudson added a comment - SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10906 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10906/ ) YARN-5774 . MR Job stuck in ACCEPTED status without any progress in Fair (templedf: rev 25f9872be63423ada6a18481eaad2888e731fdac) (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/DefaultResourceCalculator.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/DominantResourceCalculator.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/AbstractResourceRequest.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/resource/ResourceCalculator.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ResourceRequest.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/resource/TestResourceCalculator.java (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/UpdateContainerRequest.java
        Hide
        leftnoteasy Wangda Tan added a comment -

        Hi Yufei Gu, Daniel Templeton,

        I noticed this change while reviewing YARN-5959. This change looks like an incompatible change to me, it removed public/stable method from ResourceRequest, which makes API doc different. And now ResourceRequest (public/stable) inherits from a public/unstable class.

        I can understand it is added for making normalizeResource to handle both of updateContainerRequest / ResourceRequest. But changing user facing API to simplify internal implementation doesn't look like a good idea to me.

        Instead, I suggest to revert the AbstractResourceRequest change, and update void normalizeRequest(AbstractResourceRequest request); to Resource getNormalizedResource(Resource askedResource) to make "what to normalize" more clear.

        Thoughts?

        Show
        leftnoteasy Wangda Tan added a comment - Hi Yufei Gu , Daniel Templeton , I noticed this change while reviewing YARN-5959 . This change looks like an incompatible change to me, it removed public/stable method from ResourceRequest, which makes API doc different. And now ResourceRequest (public/stable) inherits from a public/unstable class. I can understand it is added for making normalizeResource to handle both of updateContainerRequest / ResourceRequest. But changing user facing API to simplify internal implementation doesn't look like a good idea to me. Instead, I suggest to revert the AbstractResourceRequest change, and update void normalizeRequest(AbstractResourceRequest request); to Resource getNormalizedResource(Resource askedResource) to make "what to normalize" more clear. Thoughts?
        Hide
        yufeigu Yufei Gu added a comment -

        Hi Tan, Wangda, I understand what you mean.
        I don't think it is an incompatible change. The setCapability() and getCapability() are still there and public/stable in each class. The only problem I see is that AbstractResourceRequest should be public/stable instead of public/unstable. I'll file a jira for that.

        Show
        yufeigu Yufei Gu added a comment - Hi Tan, Wangda , I understand what you mean. I don't think it is an incompatible change. The setCapability() and getCapability() are still there and public/stable in each class. The only problem I see is that AbstractResourceRequest should be public/stable instead of public/unstable. I'll file a jira for that.
        Hide
        leftnoteasy Wangda Tan added a comment -

        Yufei Gu,

        It is, you can run "mvn javadoc:javadoc". It now listed AbstractResourceRequest to be a public interface, and getCapacity missing from ResourceRequest.

        And in addition, this change really looks strange, let's try to make API clean as before.

        Reopen the JIRA and marked it to be 2.8.0 blocker.

        Show
        leftnoteasy Wangda Tan added a comment - Yufei Gu , It is, you can run "mvn javadoc:javadoc". It now listed AbstractResourceRequest to be a public interface, and getCapacity missing from ResourceRequest. And in addition, this change really looks strange, let's try to make API clean as before. Reopen the JIRA and marked it to be 2.8.0 blocker.
        Hide
        yufeigu Yufei Gu added a comment -

        Hi Tan, Wangda, I understand it looks like different in doc. But ResourceRequest just inherits from AbstractResourceRequest and doesn't lose anything. All the functions and their property(public/stable) are still there. Any user code can interact with ResourceRequest without changes. What Apache Hadoop Compatibility document said for a Public-Stable Java API is strictly guaranteed.

        Public-Stable API compatibility is required to ensure end-user programs and downstream projects continue to work without modification.

        However, if you don't like the solution, we can always file a new JIRA for a potential better solution. Anybody interested can work on it.

        Show
        yufeigu Yufei Gu added a comment - Hi Tan, Wangda , I understand it looks like different in doc. But ResourceRequest just inherits from AbstractResourceRequest and doesn't lose anything. All the functions and their property(public/stable) are still there. Any user code can interact with ResourceRequest without changes. What Apache Hadoop Compatibility document said for a Public-Stable Java API is strictly guaranteed. Public-Stable API compatibility is required to ensure end-user programs and downstream projects continue to work without modification. However, if you don't like the solution, we can always file a new JIRA for a potential better solution. Anybody interested can work on it.
        Hide
        leftnoteasy Wangda Tan added a comment -

        Yufei Gu, thanks for pointing the policy, no matter if it violated policy or not, this is not a good change to me, and it should not be shipped. I will create a new JIRA to track the changes, closing this one.

        Show
        leftnoteasy Wangda Tan added a comment - Yufei Gu , thanks for pointing the policy, no matter if it violated policy or not, this is not a good change to me, and it should not be shipped. I will create a new JIRA to track the changes, closing this one.
        Hide
        djp Junping Du added a comment -

        I just verified that this patch is not in branch-2.8. Set fix version to 2.9 given narrow time window for 2.8 and the correct fix is still in discussion.

        Show
        djp Junping Du added a comment - I just verified that this patch is not in branch-2.8. Set fix version to 2.9 given narrow time window for 2.8 and the correct fix is still in discussion.

          People

          • Assignee:
            yufeigu Yufei Gu
            Reporter:
            yufeigu Yufei Gu
          • Votes:
            0 Vote for this issue
            Watchers:
            11 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development