  1. Hadoop YARN
  2. YARN-5711

Propagate exceptions back to client when using hedging RM failover provider

Details

    • Reviewed

    Description

      When the RM fails over, it does not automatically re-register running apps, so each AM needs to re-register when it reconnects to the new primary. This is done by catching ApplicationMasterNotRegisteredException in allocate calls and re-registering. However, RequestHedgingRMFailoverProxyProvider does not propagate YarnException, because the actual invocation is done asynchronously on separate threads, so AMs cannot reconnect to the RM after failover.
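The re-registration pattern described above can be illustrated with a minimal, self-contained sketch. It does not use the Hadoop classes: `NotRegisteredException` stands in for ApplicationMasterNotRegisteredException, and `RmClient` and `FreshRm` are hypothetical types invented for this illustration only.

```java
class ReRegisterSketch {
  /** Hypothetical stand-in for YARN's ApplicationMasterNotRegisteredException. */
  static class NotRegisteredException extends Exception {
    NotRegisteredException(String msg) { super(msg); }
  }

  /** Hypothetical, minimal view of the AM-to-RM calls used here. */
  interface RmClient {
    void register() throws Exception;
    String allocate() throws Exception;
  }

  /** Calls allocate(); on NotRegisteredException, re-registers once and retries. */
  static String allocateWithReRegister(RmClient rm) throws Exception {
    try {
      return rm.allocate();
    } catch (NotRegisteredException e) {
      // This branch is only reachable if the exception actually reaches the caller.
      rm.register();
      return rm.allocate();
    }
  }

  /** Fake RM with no record of the AM, as a newly active RM would have after failover. */
  static class FreshRm implements RmClient {
    boolean registered = false;
    int registerCalls = 0;

    public void register() {
      registered = true;
      registerCalls++;
    }

    public String allocate() throws NotRegisteredException {
      if (!registered) {
        throw new NotRegisteredException("AM not registered");
      }
      return "allocated";
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(allocateWithReRegister(new FreshRm()));
  }
}
```

The catch block is the whole point: if the proxy layer swallows or wraps the exception, the AM never reaches the re-register path.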

      This JIRA proposes that the RequestHedgingRMFailoverProxyProvider propagate any YarnException that it encounters.
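As a rough sketch of what propagating an exception out of an asynchronous hedged call can look like, the example below uses only `java.util.concurrent`. It is not the patch attached to this issue: `HedgedCallSketch`, `invokeHedged` and `AppException` (a stand-in for YarnException) are hypothetical names, and it assumes that calls to non-active RMs do not complete before the call to the active RM.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class HedgedCallSketch {
  /** Hypothetical stand-in for YarnException. */
  static class AppException extends Exception {
    AppException(String msg) { super(msg); }
  }

  /**
   * Submits every call on its own thread and waits for the first one to finish.
   * If that call failed, the original exception is rethrown on the caller's
   * thread instead of staying wrapped in an ExecutionException.
   */
  static <T> T invokeHedged(List<Callable<T>> calls) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(calls.size());
    try {
      CompletionService<T> completed = new ExecutorCompletionService<>(pool);
      for (Callable<T> call : calls) {
        completed.submit(call);
      }
      try {
        return completed.take().get();
      } catch (ExecutionException e) {
        Throwable cause = e.getCause();
        if (cause instanceof Exception) {
          throw (Exception) cause; // unwrap so callers can catch the real type
        }
        throw e;
      }
    } finally {
      pool.shutdownNow(); // interrupt the calls that are still pending
    }
  }
}
```

With the unwrapping in place, a caller can catch the application-level exception type directly, which is what the re-registration logic in the AM relies on.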

      Attachments

        1. YARN-5711.v1.1.patch
          10 kB
          Subramaniam Krishnan
        2. YARN-5711-v1.patch
          20 kB
          Subramaniam Krishnan
        3. YARN-5711-v2.patch
          9 kB
          Subramaniam Krishnan

          Activity

            subru Subramaniam Krishnan added a comment -

            Looking at the code, one fix I can think of is to refactor the invoke method to only identify RM_IDs. Then the actual connection to the selected RM_ID (current primary) can be made directly using the main thread, as is done presently.

            jianhe, thoughts/suggestions?
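One possible reading of this suggestion is a two-phase call: probe the RMs in parallel only to learn which RM_ID is active, then make the real call on the calling thread so that any exception propagates naturally. The sketch below illustrates that reading under stated assumptions; `Rm`, `FakeRm`, `isActive` and `findActiveRmId` are hypothetical names, not part of the Hadoop API or of the committed patch.

```java
import java.util.Map;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class IdentifyThenInvokeSketch {
  /** Hypothetical, minimal view of an RM endpoint. */
  interface Rm {
    boolean isActive();
    String allocate() throws Exception;
  }

  /** Phase 1: probe all RMs in parallel and return the id of the first active one. */
  static String findActiveRmId(Map<String, Rm> rms) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(rms.size());
    try {
      CompletionService<String> probes = new ExecutorCompletionService<>(pool);
      for (Map.Entry<String, Rm> e : rms.entrySet()) {
        probes.submit(() -> e.getValue().isActive() ? e.getKey() : null);
      }
      for (int i = 0; i < rms.size(); i++) {
        String id = probes.take().get();
        if (id != null) {
          return id;
        }
      }
      throw new IllegalStateException("no active RM found");
    } finally {
      pool.shutdownNow();
    }
  }

  /** Phase 2: call the active RM on the caller's thread; exceptions are not wrapped. */
  static String allocate(Map<String, Rm> rms) throws Exception {
    String activeId = findActiveRmId(rms);
    return rms.get(activeId).allocate();
  }

  /** Fake RM whose allocate() always fails, to show the exception reaching the caller. */
  static class FakeRm implements Rm {
    final boolean active;

    FakeRm(boolean active) { this.active = active; }

    public boolean isActive() { return active; }

    public String allocate() throws Exception {
      throw new Exception("AM not registered");
    }
  }
}
```

The trade-off in this design is an extra round trip to identify the active RM in exchange for ordinary, synchronous exception semantics on the real call.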
            subru Subramaniam Krishnan added a comment - - edited

            Attaching a patch that returns any exception encountered with the active RM as discussed offline with jianhe. Thanks to ellenfkh for extensively testing this out in our cluster.

            FYI, there are some formatting fixes in RequestHedgingRMFailoverProxyProvider, as it seems to follow the IntelliJ formatter rather than the standard Hadoop style.

            hadoopqa Hadoop QA added a comment -
            -1 overall



            Vote Subsystem Runtime Comment
            0 reexec 0m 0s Docker mode activated.
            -1 patch 0m 6s YARN-5711 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help.



            Subsystem Report/Notes
            JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12834588/YARN-5711-v1.patch
            JIRA Issue YARN-5711
            Console output https://builds.apache.org/job/PreCommit-YARN-Build/13461/console
            Powered by Apache Yetus 0.3.0 http://yetus.apache.org

            This message was automatically generated.

            subru Subramaniam Krishnan added a comment - - edited

            Re-uploading same patch (post dos2unix) to kick Yetus.

            hadoopqa Hadoop QA added a comment -
            -1 overall



            Vote Subsystem Runtime Comment
            0 reexec 0m 16s Docker mode activated.
            +1 @author 0m 0s The patch does not contain any @author tags.
            +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
            0 mvndep 0m 9s Maven dependency ordering for branch
            +1 mvninstall 6m 58s trunk passed
            +1 compile 2m 21s trunk passed
            +1 checkstyle 0m 38s trunk passed
            +1 mvnsite 0m 56s trunk passed
            +1 mvneclipse 0m 28s trunk passed
            +1 findbugs 1m 24s trunk passed
            +1 javadoc 0m 44s trunk passed
            0 mvndep 0m 10s Maven dependency ordering for patch
            +1 mvninstall 0m 48s the patch passed
            +1 compile 2m 21s the patch passed
            +1 javac 2m 21s the patch passed
            -1 checkstyle 0m 36s hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 1 unchanged - 1 fixed = 2 total (was 2)
            +1 mvnsite 0m 51s the patch passed
            +1 mvneclipse 0m 25s the patch passed
            +1 whitespace 0m 0s The patch has no whitespace issues.
            +1 findbugs 1m 37s the patch passed
            +1 javadoc 0m 39s the patch passed
            +1 unit 2m 20s hadoop-yarn-common in the patch passed.
            -1 unit 16m 10s hadoop-yarn-client in the patch failed.
            +1 asflicense 0m 17s The patch does not generate ASF License warnings.
            40m 54s



            Reason Tests
            Failed junit tests hadoop.yarn.client.cli.TestLogsCLI



            Subsystem Report/Notes
            Docker Image:yetus/hadoop:9560f25
            JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12834725/YARN-5711.v1.1.patch
            JIRA Issue YARN-5711
            Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
            uname Linux 024744986bd0 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
            Build tool maven
            Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
            git revision trunk / f63cd78
            Default Java 1.8.0_101
            findbugs v3.0.0
            checkstyle https://builds.apache.org/job/PreCommit-YARN-Build/13470/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
            unit https://builds.apache.org/job/PreCommit-YARN-Build/13470/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
            unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13470/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
            Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13470/testReport/
            modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: hadoop-yarn-project/hadoop-yarn
            Console output https://builds.apache.org/job/PreCommit-YARN-Build/13470/console
            Powered by Apache Yetus 0.3.0 http://yetus.apache.org

            This message was automatically generated.

            jianhe Jian He added a comment -

            Looks good, +1


            subru Subramaniam Krishnan added a comment -

            Thanks jianhe for the quick review.

            I have updated the patch to fix the checkstyle issue; the test case failure is unrelated. I'll wait for Yetus to come back and then commit.

            hadoopqa Hadoop QA added a comment -
            -1 overall



            Vote Subsystem Runtime Comment
            0 reexec 0m 15s Docker mode activated.
            +1 @author 0m 0s The patch does not contain any @author tags.
            +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
            0 mvndep 0m 59s Maven dependency ordering for branch
            +1 mvninstall 8m 12s trunk passed
            +1 compile 2m 28s trunk passed
            +1 checkstyle 0m 39s trunk passed
            +1 mvnsite 0m 56s trunk passed
            +1 mvneclipse 0m 29s trunk passed
            +1 findbugs 1m 27s trunk passed
            +1 javadoc 0m 41s trunk passed
            0 mvndep 0m 9s Maven dependency ordering for patch
            +1 mvninstall 0m 43s the patch passed
            +1 compile 2m 18s the patch passed
            +1 javac 2m 18s the patch passed
            +1 checkstyle 0m 37s hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 1 unchanged - 1 fixed = 1 total (was 2)
            +1 mvnsite 0m 51s the patch passed
            +1 mvneclipse 0m 23s the patch passed
            +1 whitespace 0m 0s The patch has no whitespace issues.
            +1 findbugs 1m 37s the patch passed
            +1 javadoc 0m 38s the patch passed
            +1 unit 2m 19s hadoop-yarn-common in the patch passed.
            -1 unit 16m 14s hadoop-yarn-client in the patch failed.
            +1 asflicense 0m 19s The patch does not generate ASF License warnings.
            43m 0s



            Reason Tests
            Failed junit tests hadoop.yarn.client.cli.TestLogsCLI



            Subsystem Report/Notes
            Docker Image:yetus/hadoop:9560f25
            JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12834751/YARN-5711-v2.patch
            JIRA Issue YARN-5711
            Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
            uname Linux 2306cf340cd5 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
            Build tool maven
            Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
            git revision trunk / 23d7d53
            Default Java 1.8.0_101
            findbugs v3.0.0
            unit https://builds.apache.org/job/PreCommit-YARN-Build/13474/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
            unit test logs https://builds.apache.org/job/PreCommit-YARN-Build/13474/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
            Test Results https://builds.apache.org/job/PreCommit-YARN-Build/13474/testReport/
            modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: hadoop-yarn-project/hadoop-yarn
            Console output https://builds.apache.org/job/PreCommit-YARN-Build/13474/console
            Powered by Apache Yetus 0.3.0 http://yetus.apache.org

            This message was automatically generated.


            subru Subramaniam Krishnan added a comment -

            I just committed this to trunk/branch-2. Thanks jianhe and ellenfkh.

            hudson Hudson added a comment -

            SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10669 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10669/)
            YARN-5711. Propogate exceptions back to client when using hedging RM (subru: rev 0a166b13472213db0a0cd2dfdaddb2b1746b3957)

            • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RequestHedgingRMFailoverProxyProvider.java
            • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestHedgingRequestRMFailoverProxyProvider.java

            People

              subru Subramaniam Krishnan
              subru Subramaniam Krishnan
              Votes: 0
              Watchers: 4
