Test: Fix launchAM in MockRM to wait for attempt to be scheduled

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: yarn
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      MockRM#launchAM fails in many test runs because it does not wait for the app attempt to be scheduled before the NM update is sent, as noted in recent builds.
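The ordering problem described here can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyScheduler`, `submitAttempt`, `nodeHeartbeat`, and `drainEvents` are hypothetical names invented for the illustration and are not YARN or MockRM APIs. The model assumes that submitting an attempt is handled asynchronously and that a node heartbeat can only allocate a container for an attempt that is already scheduled.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model only: none of these classes or methods exist in YARN.
class LaunchRaceDemo {
    static class ToyScheduler {
        private final Queue<Runnable> pendingEvents = new ArrayDeque<>();
        private boolean attemptScheduled = false;
        private boolean containerAllocated = false;

        // Submitting an attempt only enqueues work; it is handled later.
        void submitAttempt() {
            pendingEvents.add(() -> attemptScheduled = true);
        }

        // A heartbeat allocates a container only if the attempt is
        // already scheduled; otherwise the heartbeat has no effect.
        void nodeHeartbeat() {
            if (attemptScheduled) {
                containerAllocated = true;
            }
        }

        // Processes everything queued so far. In this model it stands in
        // for "wait until the attempt is scheduled".
        void drainEvents() {
            Runnable event;
            while ((event = pendingEvents.poll()) != null) {
                event.run();
            }
        }

        boolean isContainerAllocated() {
            return containerAllocated;
        }
    }

    // Heartbeat sent right after submission, without waiting.
    static boolean launchWithoutWait() {
        ToyScheduler scheduler = new ToyScheduler();
        scheduler.submitAttempt();
        scheduler.nodeHeartbeat();
        scheduler.drainEvents();
        return scheduler.isContainerAllocated();
    }

    // Wait for the attempt to be scheduled, then send the heartbeat.
    static boolean launchWithWait() {
        ToyScheduler scheduler = new ToyScheduler();
        scheduler.submitAttempt();
        scheduler.drainEvents();
        scheduler.nodeHeartbeat();
        return scheduler.isContainerAllocated();
    }

    public static void main(String[] args) {
        System.out.println("allocated without wait: " + launchWithoutWait());
        System.out.println("allocated with wait:    " + launchWithWait());
    }
}
```

In a real test the ordering is not fixed, which is why a failure of this kind would show up only intermittently.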

        Activity

        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2129 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2129/)
        YARN-3533. Test: Fix launchAM in MockRM to wait for attempt to be scheduled. Contributed by Anubhav Dhoot (jianhe: rev 4c1af156aef4f3bb1d9823d5980c59b12007dc77)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #180 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/180/)
        YARN-3533. Test: Fix launchAM in MockRM to wait for attempt to be scheduled. Contributed by Anubhav Dhoot (jianhe: rev 4c1af156aef4f3bb1d9823d5980c59b12007dc77)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk #913 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/913/)
        YARN-3533. Test: Fix launchAM in MockRM to wait for attempt to be scheduled. Contributed by Anubhav Dhoot (jianhe: rev 4c1af156aef4f3bb1d9823d5980c59b12007dc77)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #170 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/170/)
        YARN-3533. Test: Fix launchAM in MockRM to wait for attempt to be scheduled. Contributed by Anubhav Dhoot (jianhe: rev 4c1af156aef4f3bb1d9823d5980c59b12007dc77)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #179 (See https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/179/)
        YARN-3533. Test: Fix launchAM in MockRM to wait for attempt to be scheduled. Contributed by Anubhav Dhoot (jianhe: rev 4c1af156aef4f3bb1d9823d5980c59b12007dc77)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-Hdfs-trunk #2111 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2111/)
        YARN-3533. Test: Fix launchAM in MockRM to wait for attempt to be scheduled. Contributed by Anubhav Dhoot (jianhe: rev 4c1af156aef4f3bb1d9823d5980c59b12007dc77)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
        jianhe Jian He added a comment -

        Committed to trunk and branch-2, thanks Anubhav!

        hudson Hudson added a comment -

        SUCCESS: Integrated in Hadoop-trunk-Commit #7702 (See https://builds.apache.org/job/Hadoop-trunk-Commit/7702/)
        YARN-3533. Test: Fix launchAM in MockRM to wait for attempt to be scheduled. Contributed by Anubhav Dhoot (jianhe: rev 4c1af156aef4f3bb1d9823d5980c59b12007dc77)

        • hadoop-yarn-project/CHANGES.txt
        • hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
        jianhe Jian He added a comment -

        Committed to trunk and branch-2, thanks Anubhav!
        Thanks sandflee and Rohith Sharma K S for the review!

        jianhe Jian He added a comment -

        getApplicationAttempt seems confusing, I just opened https://issues.apache.org/jira/browse/YARN-3546 to discuss this

        I replied on the jira.

        The TestContainerAllocation failure is unrelated to this patch. Opening a new JIRA to fix that.

        Committing this.

        jianhe Jian He added a comment -

        Patch looks good to me, thanks Anubhav Dhoot!
        Hopefully this can resolve some of the intermittent failures we've seen recently.

        sandflee sandflee added a comment -

        getApplicationAttempt seems confusing, I just opened https://issues.apache.org/jira/browse/YARN-3546 to discuss this

        hadoopqa Hadoop QA added a comment -



        -1 overall

        Vote  Subsystem        Runtime  Comment
        0     pre-patch        5m 12s   Pre-patch trunk compilation is healthy.
        +1    @author          0m 0s    The patch does not contain any @author tags.
        +1    tests included   0m 0s    The patch appears to include 1 new or modified test files.
        +1    whitespace       0m 0s    The patch has no lines that end in whitespace.
        +1    javac            7m 31s   There were no new javac warning messages.
        +1    release audit    0m 20s   The applied patch does not increase the total number of release audit warnings.
        +1    checkstyle       5m 26s   There were no new checkstyle issues.
        +1    install          1m 32s   mvn install still works.
        +1    eclipse:eclipse  0m 31s   The patch built with eclipse:eclipse.
        +1    findbugs         1m 14s   The patch does not introduce any new Findbugs (version 2.0.3) warnings.
        -1    yarn tests       52m 47s  Tests failed in hadoop-yarn-server-resourcemanager.
                               74m 35s

        Reason             Tests
        Failed unit tests  hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation

        Subsystem                                    Report/Notes
        Patch URL                                    http://issues.apache.org/jira/secure/attachment/12727472/YARN-3533.001.patch
        Optional Tests                               javac unit findbugs checkstyle
        git revision                                 trunk / a100be6
        hadoop-yarn-server-resourcemanager test log  https://builds.apache.org/job/PreCommit-YARN-Build/7467/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
        Test Results                                 https://builds.apache.org/job/PreCommit-YARN-Build/7467/testReport/
        Console output                               https://builds.apache.org/job/PreCommit-YARN-Build/7467//console

        This message was automatically generated.

        sandflee sandflee added a comment -

        Thanks for your patch.
        1. waitForSchedulerAppAttemptAdded may not work as expected. In waitForSchedulerAppAttemptAdded, the lookup goes through:

        public T getApplicationAttempt(ApplicationAttemptId applicationAttemptId) {
          SchedulerApplication<T> app =
              applications.get(applicationAttemptId.getApplicationId());
          return app == null ? null : app.getCurrentAppAttempt();
        }

        As shown above, this function just returns the current app attempt, not the app attempt corresponding to applicationAttemptId. (A bug?)
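The concern in point 1 can be sketched with a small stand-in model. Everything here is hypothetical and exists only for illustration: `App`, `getCurrentAttempt`, and `getExactAttempt` are not YARN classes or methods, and attempts are reduced to integer ids and strings. The sketch contrasts a lookup that ignores the requested attempt id with one that honours it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; these are not the YARN scheduler classes.
class AttemptLookupDemo {
    static final class App {
        final Map<Integer, String> attempts = new HashMap<>();
        int currentAttemptId;

        // Registers an attempt and makes it the current one.
        void addAttempt(int attemptId, String name) {
            attempts.put(attemptId, name);
            currentAttemptId = attemptId;
        }
    }

    // Mirrors the behaviour questioned above: the requested attempt id
    // is ignored and the current attempt is always returned.
    static String getCurrentAttempt(App app, int requestedAttemptId) {
        return app == null ? null : app.attempts.get(app.currentAttemptId);
    }

    // Stricter variant: returns only the attempt that was asked for,
    // or null if that attempt is unknown.
    static String getExactAttempt(App app, int requestedAttemptId) {
        return app == null ? null : app.attempts.get(requestedAttemptId);
    }

    public static void main(String[] args) {
        App app = new App();
        app.addAttempt(1, "attempt-1");
        app.addAttempt(2, "attempt-2");

        // Asking for attempt 1 after attempt 2 has become current.
        System.out.println("current-based lookup: " + getCurrentAttempt(app, 1));
        System.out.println("exact lookup:         " + getExactAttempt(app, 1));
    }
}
```

With the current-based lookup, a wait that asks for an older attempt could be satisfied by a newer one, which may or may not be what a test intends.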

        2. SCHEDULED is not a stable state. Could another NM heartbeat move the attempt to ALLOCATED, so that a wait for SCHEDULED blocks?
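The worry in point 2 can be sketched as follows. This is an illustration under assumptions, not the patch: the `State` enum, `waitForAny`, and `replay` are hypothetical, and the sequence of observed states is made up. It shows a poller that misses a transient state when it waits for exactly that state, and one possible mitigation, which is to accept any state at or beyond the one of interest. A real wait would also sleep between polls; that is left out to keep the sketch deterministic.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch; State is not the YARN attempt state enum.
class TransientStateDemo {
    enum State { SUBMITTED, SCHEDULED, ALLOCATED }

    // Polls up to maxPolls times and returns true as soon as the
    // observed state is one of the accepted states.
    static boolean waitForAny(Supplier<State> current, Set<State> accepted, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            if (accepted.contains(current.get())) {
                return true;
            }
        }
        return false;
    }

    // Supplier that replays a fixed sequence of observations and then
    // keeps returning the last one.
    static Supplier<State> replay(List<State> observed) {
        Iterator<State> it = observed.iterator();
        State[] last = new State[1];
        return () -> {
            if (it.hasNext()) {
                last[0] = it.next();
            }
            return last[0];
        };
    }

    // The poller never observes SCHEDULED because the attempt moved
    // on to ALLOCATED between two polls.
    static boolean demoExactWait() {
        List<State> observed = Arrays.asList(State.SUBMITTED, State.ALLOCATED);
        return waitForAny(replay(observed), EnumSet.of(State.SCHEDULED), 5);
    }

    // Accepting SCHEDULED or ALLOCATED tolerates the skipped state.
    static boolean demoTolerantWait() {
        List<State> observed = Arrays.asList(State.SUBMITTED, State.ALLOCATED);
        return waitForAny(replay(observed), EnumSet.of(State.SCHEDULED, State.ALLOCATED), 5);
    }

    public static void main(String[] args) {
        System.out.println("exact wait satisfied:    " + demoExactWait());
        System.out.println("tolerant wait satisfied: " + demoTolerantWait());
    }
}
```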

        rohithsharma Rohith Sharma K S added a comment -

        Submitting the patch on behalf of Anubhav to kick off Jenkins.

        rohithsharma Rohith Sharma K S added a comment -

        +1 (non-binding), LGTM.

        adhoot Anubhav Dhoot added a comment -

        Fix that adds an explicit wait for state and makes the other waits throw when they time out.
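A wait that throws on timeout, as opposed to returning silently, might look like the sketch below. It is not the code from the patch: `waitForState`, its parameters, and the choice of `IllegalStateException` are assumptions made for the illustration. The point is that a timed-out wait fails at the wait itself, so the test does not continue in an unexpected state and fail later with a less helpful error.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration; not the MockRM API.
class WaitForStateDemo {
    // Polls the supplier until it reports the expected state. Throws
    // IllegalStateException if the deadline passes first.
    static <T> void waitForState(Supplier<T> current, T expected,
                                 long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!expected.equals(current.get())) {
            if (System.nanoTime() >= deadline) {
                throw new IllegalStateException(
                    "Timed out waiting for state " + expected
                        + "; last observed " + current.get());
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and fail the wait.
                Thread.currentThread().interrupt();
                throw new IllegalStateException(
                    "Interrupted while waiting for state " + expected, e);
            }
        }
    }

    public static void main(String[] args) {
        // Returns immediately because the state already matches.
        waitForState(() -> "SCHEDULED", "SCHEDULED", 100, 5);
        System.out.println("first wait returned normally");

        // Never matches, so the wait throws after the timeout.
        try {
            waitForState(() -> "NEW", "SCHEDULED", 50, 5);
        } catch (IllegalStateException e) {
            System.out.println("second wait failed: " + e.getMessage());
        }
    }
}
```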


          People

          • Assignee:
            adhoot Anubhav Dhoot
            Reporter:
            adhoot Anubhav Dhoot