Hadoop YARN / YARN-3848

TestNodeLabelContainerAllocation is not timing out

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 3.0.0-alpha2
    • Component/s: test
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      A number of builds, pre-commit and otherwise, have been failing recently because TestNodeLabelContainerAllocation has timed out. See https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, or YARN-3826 for examples.

      1. test_output.txt (499 kB, Varun Saxena)
      2. YARN-3848.01.patch (3 kB, Varun Saxena)
      3. YARN-3848.02.patch (1 kB, Varun Saxena)

        Activity

        varun_saxena Varun Saxena added a comment -

        The failing test is testQueueMaxCapacitiesWillNotBeHonoredWhenNotRespectingExclusivity. The test output is attached.
        MockRM is being stopped while the dispatcher still has events in its queue, which leads to an InterruptedException. JUnit wrongly reports this as a timeout, even though it isn't one.

        varun_saxena Varun Saxena added a comment -

        I mean the test does not have a timeout.

        varun_saxena Varun Saxena added a comment -

        I could have put a sleep in the test, but I check for the dispatcher queue being drained instead.
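
The idea of waiting for the dispatcher queue to drain, rather than sleeping for a fixed time, can be sketched roughly as follows. This is only an illustration built on java.util.concurrent: ToyDispatcher, waitForDrained and DrainDemo are hypothetical names invented for this sketch, not the real AsyncDispatcher or MockRM API, and it is not the code from the attached patch.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an asynchronous event dispatcher.
// This is NOT the real AsyncDispatcher or MockRM API.
class ToyDispatcher {
  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  // Events that were dispatched but are not yet fully handled.
  private final AtomicInteger pending = new AtomicInteger();
  private final Thread worker = new Thread(() -> {
    try {
      while (true) {
        Runnable event = queue.take(); // throws InterruptedException once stop() interrupts
        try {
          event.run();
        } finally {
          pending.decrementAndGet();
        }
      }
    } catch (InterruptedException e) {
      // Stopped: anything still in the queue is abandoned.
    }
  }, "toy-dispatcher");

  void start() {
    worker.start();
  }

  void dispatch(Runnable event) {
    pending.incrementAndGet();
    queue.add(event);
  }

  // Polls until every dispatched event has been handled, or gives up at the deadline.
  boolean waitForDrained(long timeoutMs) throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (pending.get() > 0) {
      if (System.nanoTime() - deadline > 0) {
        return false;
      }
      Thread.sleep(10);
    }
    return true;
  }

  void stop() throws InterruptedException {
    worker.interrupt();
    worker.join();
  }
}

class DrainDemo {
  // Dispatches 100 events, waits for the drain, then stops the dispatcher.
  // Returns the number of handled events, or -1 if the drain wait timed out.
  static int runScenario() {
    ToyDispatcher dispatcher = new ToyDispatcher();
    AtomicInteger handled = new AtomicInteger();
    dispatcher.start();
    for (int i = 0; i < 100; i++) {
      dispatcher.dispatch(handled::incrementAndGet);
    }
    try {
      boolean drained = dispatcher.waitForDrained(5_000);
      dispatcher.stop();
      return drained ? handled.get() : -1;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
  }

  public static void main(String[] args) {
    System.out.println("handled = " + runScenario());
  }
}
```

Calling stop() without the waitForDrained step would interrupt the worker while events may still be queued, which is the kind of race described above. The bounded wait also means a dispatcher that never drains shows up as a clear failure instead of a hang.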

        hadoopqa Hadoop QA added a comment -

        +1 overall

        Vote  Subsystem        Runtime  Comment
        0     pre-patch        17m 20s  Pre-patch trunk compilation is healthy.
        +1    @author          0m 0s    The patch does not contain any @author tags.
        +1    tests included   0m 0s    The patch appears to include 1 new or modified test files.
        +1    javac            7m 38s   There were no new javac warning messages.
        +1    javadoc          9m 34s   There were no new javadoc warning messages.
        +1    release audit    0m 23s   The applied patch does not increase the total number of release audit warnings.
        +1    checkstyle       1m 30s   There were no new checkstyle issues.
        +1    whitespace       0m 0s    The patch has no lines that end in whitespace.
        +1    install          1m 36s   mvn install still works.
        +1    eclipse:eclipse  0m 33s   The patch built with eclipse:eclipse.
        +1    findbugs         2m 59s   The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1    yarn tests       1m 57s   Tests passed in hadoop-yarn-common.
        +1    yarn tests       50m 53s  Tests passed in hadoop-yarn-server-resourcemanager.
                               94m 27s

        Subsystem                                    Report/Notes
        Patch URL                                    http://issues.apache.org/jira/secure/attachment/12742338/YARN-3848.01.patch
        Optional Tests                               javadoc javac unit findbugs checkstyle
        git revision                                 trunk / 79ed0f9
        hadoop-yarn-common test log                  https://builds.apache.org/job/PreCommit-YARN-Build/8362/artifact/patchprocess/testrun_hadoop-yarn-common.txt
        hadoop-yarn-server-resourcemanager test log  https://builds.apache.org/job/PreCommit-YARN-Build/8362/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
        Test Results                                 https://builds.apache.org/job/PreCommit-YARN-Build/8362/testReport/
        Java                                         1.7.0_55
        uname                                        Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output                               https://builds.apache.org/job/PreCommit-YARN-Build/8362/console

        This message was automatically generated.

        varun_saxena Varun Saxena added a comment -

        Jason Lowe / Wangda Tan, kindly review.

        leftnoteasy Wangda Tan added a comment -

        Hi Varun Saxena,

        Thanks for working on this patch.

        I am trying to understand the problem: do you know why the dispatcher cannot be drained? If it cannot be drained, that seems like a bigger issue. Could you attach a jstack trace if you can reproduce it? I ran the test locally many (10+) times but could not reproduce the failure.

        Adding a timeout to the tests makes sense to me.

        leftnoteasy Wangda Tan added a comment -

        Varun Saxena, could you take a look at my previous comment? I want to understand if this is the correct fix.

        Thanks,

        varun_saxena Varun Saxena added a comment -

        Wangda Tan, I found the root cause. It's due to a bug in AsyncDispatcher. I will raise another JIRA for it.

        leftnoteasy Wangda Tan added a comment -

        Thanks Varun Saxena.

        I suggest doing the proper fix in a separate JIRA (if the fix won't take too much time).

        If you plan to add some fixes to TestNodeLabelContainerAllocation, I suggest adding some timeout checking in this JIRA.

        Thoughts?

        varun_saxena Varun Saxena added a comment -

        Wangda Tan, do you mean adding a timeout to the test case?
        Only to the one that had the problem, or to all of them?

        stevel@apache.org Steve Loughran added a comment -

        Everywhere. You can do this with a single rule:

          // JUnit 4 rule: applies a 60-second timeout to every test method in the class
          @Rule
          public Timeout testTimeout = new Timeout(60 * 1000);
        
        hadoopqa Hadoop QA added a comment -

        -1 overall

        Vote  Subsystem        Runtime  Comment
        -1    pre-patch        17m 37s  Findbugs (version 3.0.0) appears to be broken on trunk.
        +1    @author          0m 0s    The patch does not contain any @author tags.
        +1    tests included   0m 0s    The patch appears to include 1 new or modified test files.
        +1    javac            7m 57s   There were no new javac warning messages.
        +1    javadoc          10m 27s  There were no new javadoc warning messages.
        +1    release audit    0m 23s   The applied patch does not increase the total number of release audit warnings.
        +1    checkstyle       1m 10s   There were no new checkstyle issues.
        +1    whitespace       0m 0s    The patch has no lines that end in whitespace.
        +1    install          1m 35s   mvn install still works.
        +1    eclipse:eclipse  0m 34s   The patch built with eclipse:eclipse.
        +1    findbugs         3m 6s    The patch does not introduce any new Findbugs (version 3.0.0) warnings.
        +1    yarn tests       2m 2s    Tests passed in hadoop-yarn-common.
        +1    yarn tests       57m 7s   Tests passed in hadoop-yarn-server-resourcemanager.
                               102m 1s

        Subsystem                                    Report/Notes
        Patch URL                                    http://issues.apache.org/jira/secure/attachment/12742338/YARN-3848.01.patch
        Optional Tests                               javadoc javac unit findbugs checkstyle
        git revision                                 trunk / aa299ec
        hadoop-yarn-common test log                  https://builds.apache.org/job/PreCommit-YARN-Build/9432/artifact/patchprocess/testrun_hadoop-yarn-common.txt
        hadoop-yarn-server-resourcemanager test log  https://builds.apache.org/job/PreCommit-YARN-Build/9432/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
        Test Results                                 https://builds.apache.org/job/PreCommit-YARN-Build/9432/testReport/
        Java                                         1.7.0_55
        uname                                        Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Console output                               https://builds.apache.org/job/PreCommit-YARN-Build/9432/console

        This message was automatically generated.

        varun_saxena Varun Saxena added a comment -

        Thanks Steve Loughran for your suggestion.

        varun_saxena Varun Saxena added a comment -

        If you plan to add some fixes to TestNodeLabelContainerAllocation, I suggest you can add some timeout checking in this JIRA.

        Wangda Tan, do you want me to add a timeout to the test case? Or should I close the JIRA?

        Naganarasimha Naganarasimha G R added a comment -

        Varun Saxena, I feel a timeout is sufficient and the other changes can be avoided.

        hadoopqa Hadoop QA added a comment -

        -1 overall

        Vote  Subsystem  Runtime  Comment
        0     reexec     0m 0s    Docker mode activated.
        -1    patch      0m 4s    YARN-3848 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help.

        Subsystem       Report/Notes
        JIRA Issue      YARN-3848
        JIRA Patch URL  https://issues.apache.org/jira/secure/attachment/12742338/YARN-3848.01.patch
        Console output  https://builds.apache.org/job/PreCommit-YARN-Build/13546/console
        Powered by      Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        hadoopqa Hadoop QA added a comment -

        +1 overall

        Vote  Subsystem   Runtime  Comment
        0     reexec      0m 17s   Docker mode activated.
        +1    @author     0m 0s    The patch does not contain any @author tags.
        +1    test4tests  0m 0s    The patch appears to include 1 new or modified test files.
        +1    mvninstall  6m 53s   trunk passed
        +1    compile     0m 32s   trunk passed
        +1    checkstyle  0m 21s   trunk passed
        +1    mvnsite     0m 37s   trunk passed
        +1    mvneclipse  0m 17s   trunk passed
        +1    findbugs    0m 58s   trunk passed
        +1    javadoc     0m 21s   trunk passed
        +1    mvninstall  0m 31s   the patch passed
        +1    compile     0m 29s   the patch passed
        +1    javac       0m 29s   the patch passed
        +1    checkstyle  0m 18s   the patch passed
        +1    mvnsite     0m 36s   the patch passed
        +1    mvneclipse  0m 14s   the patch passed
        +1    whitespace  0m 0s    The patch has no whitespace issues.
        +1    findbugs    1m 3s    the patch passed
        +1    javadoc     0m 18s   the patch passed
        +1    unit        36m 12s  hadoop-yarn-server-resourcemanager in the patch passed.
        +1    asflicense  0m 16s   The patch does not generate ASF License warnings.
                          51m 28s

        Subsystem       Report/Notes
        Docker          Image:yetus/hadoop:9560f25
        JIRA Issue      YARN-3848
        JIRA Patch URL  https://issues.apache.org/jira/secure/attachment/12835537/YARN-3848.02.patch
        Optional Tests  asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname           Linux 093aca5d8a5e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool      maven
        Personality     /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision    trunk / 77f2684
        Default Java    1.8.0_101
        findbugs        v3.0.0
        Test Results    https://builds.apache.org/job/PreCommit-YARN-Build/13549/testReport/
        modules         C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
        Console output  https://builds.apache.org/job/PreCommit-YARN-Build/13549/console
        Powered by      Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        Naganarasimha Naganarasimha G R added a comment -

        Thanks for the patch, Varun Saxena, and for the review, Tan, Wangda. I have committed this to trunk, 2.8 and 2.9.

        hudson Hudson added a comment -

        FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #10702 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10702/)
        YARN-3848. TestNodeLabelContainerAllocation is timing out. Contributed (naganarasimha_gr: rev 6c8830992c734313fc95ac9988539c4d813c3581)

        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestNodeLabelContainerAllocation.java

          People

          • Assignee: Varun Saxena (varun_saxena)
          • Reporter: Jason Lowe (jlowe)
          • Votes: 0
          • Watchers: 8
