YARN-7020: TestAMRMProxy#testAMRMProxyTokenRenewal is flakey

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-beta1
    • Fix Version/s: 2.9.0, 3.0.0-beta1, 2.8.2
    • Component/s: None
    • Labels: None

      Description

      TestAMRMProxy#testAMRMProxyTokenRenewal is flakey. It infrequently fails with:

      testAMRMProxyTokenRenewal(org.apache.hadoop.yarn.client.api.impl.TestAMRMProxy)  Time elapsed: 19.036 sec  <<< ERROR!
      org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: Application attempt appattempt_1502837054903_0001_000001 doesn't exist in ApplicationMasterService cache.
      	at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:355)
      	at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor$3.allocate(DefaultRequestInterceptor.java:224)
      	at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.DefaultRequestInterceptor.allocate(DefaultRequestInterceptor.java:135)
      	at org.apache.hadoop.yarn.server.nodemanager.amrmproxy.AMRMProxyService.allocate(AMRMProxyService.java:279)
      	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60)
      	at org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
      	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
      	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
      	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
      	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
      
      	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1490)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1436)
      	at org.apache.hadoop.ipc.Client.call(Client.java:1346)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
      	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
      	at com.sun.proxy.$Proxy90.allocate(Unknown Source)
      	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
      	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
      	at com.sun.proxy.$Proxy91.allocate(Unknown Source)
      	at org.apache.hadoop.yarn.client.api.impl.TestAMRMProxy.testAMRMProxyTokenRenewal(TestAMRMProxy.java:190)
      

        Activity

        rkanter Robert Kanter added a comment -

        Thanks Jason!

        hudson Hudson added a comment -

        SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12196 (See https://builds.apache.org/job/Hadoop-trunk-Commit/12196/)
        YARN-7020. TestAMRMProxy#testAMRMProxyTokenRenewal is flakey. (jlowe: rev 14553061be0a341df3e628dcaf06717b4630b05e)

        • (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestAMRMProxy.java
        jlowe Jason Lowe added a comment -

        Thanks, Robert! I committed this to trunk, branch-2, branch-2.8, and branch-2.8.2.

        jlowe Jason Lowe added a comment -

        +1 lgtm. Committing this.

        rkanter Robert Kanter added a comment -

        The test failure is unrelated (YARN-6272).

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote  Subsystem    Runtime   Comment
         0    reexec       0m 24s    Docker mode activated.
              Prechecks
        +1    @author      0m 0s     The patch does not contain any @author tags.
        +1    test4tests   0m 0s     The patch appears to include 1 new or modified test files.
              trunk Compile Tests
        +1    mvninstall   15m 45s   trunk passed
        +1    compile      0m 23s    trunk passed
        +1    checkstyle   0m 15s    trunk passed
        +1    mvnsite      0m 24s    trunk passed
        +1    findbugs     0m 34s    trunk passed
        +1    javadoc      0m 15s    trunk passed
              Patch Compile Tests
        +1    mvninstall   0m 21s    the patch passed
        +1    compile      0m 20s    the patch passed
        +1    javac        0m 20s    the patch passed
        +1    checkstyle   0m 12s    the patch passed
        +1    mvnsite      0m 22s    the patch passed
        +1    whitespace   0m 0s     The patch has no whitespace issues.
        +1    findbugs     0m 39s    the patch passed
        +1    javadoc      0m 13s    the patch passed
              Other Tests
        -1    unit         21m 20s   hadoop-yarn-client in the patch failed.
        +1    asflicense   0m 15s    The patch does not generate ASF License warnings.
                           43m 3s



        Reason Tests
        Failed junit tests hadoop.yarn.client.api.impl.TestAMRMClient



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:14b5c93
        JIRA Issue YARN-7020
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12882042/YARN-7020.001.patch
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux 862dbbe5fce6 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / f34646d
        Default Java 1.8.0_144
        findbugs v3.1.0-RC1
        unit https://builds.apache.org/job/PreCommit-YARN-Build/16920/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
        Test Results https://builds.apache.org/job/PreCommit-YARN-Build/16920/testReport/
        modules C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client
        Console output https://builds.apache.org/job/PreCommit-YARN-Build/16920/console
        Powered by Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        rkanter Robert Kanter added a comment -

        This is due to a timing issue. The test sets several configs to 1.5-second intervals, including yarn.am.liveness-monitor.expiry-interval-ms. When the expiry event fires in RMAppAttemptImpl, the app attempt is removed from the cache; if ApplicationMasterService then tries to read the attempt from the cache, it can't find it and throws the error above.

        I'm open to ideas on how to remove the timing element from this test, but for now I've raised the intervals to make it more reliable. In my testing, the original values could only accommodate a 1-second delay in ApplicationMasterService#allocate; with my changes, the test can accommodate a 4-second delay.
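
        The race described here can be illustrated with a toy model. This is only a sketch, not Hadoop code: the class AttemptExpiryModel, its methods, the fake clock, and the 6000 ms interval are all hypothetical, and the real interaction between the liveness monitor, RMAppAttemptImpl, and ApplicationMasterService is more involved. The model assumes an attempt is evicted once the time since its last heartbeat exceeds the expiry interval, and that a later allocate call then fails.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the expiry race; hypothetical names, not the Hadoop implementation. */
final class AttemptExpiryModel {
  private final long expiryIntervalMs;
  // attempt id -> time of last heartbeat, in fake-clock milliseconds
  private final Map<String, Long> lastHeartbeatMs = new HashMap<>();
  private long nowMs = 0; // fake clock, so the example is deterministic

  AttemptExpiryModel(long expiryIntervalMs) {
    this.expiryIntervalMs = expiryIntervalMs;
  }

  void register(String attemptId) {
    lastHeartbeatMs.put(attemptId, nowMs);
  }

  /** Advances the fake clock and evicts attempts whose last heartbeat is too old. */
  void advance(long deltaMs) {
    nowMs += deltaMs;
    lastHeartbeatMs.values().removeIf(last -> nowMs - last > expiryIntervalMs);
  }

  /** Fails if the attempt was already evicted; otherwise counts as a heartbeat. */
  void allocate(String attemptId) {
    if (!lastHeartbeatMs.containsKey(attemptId)) {
      throw new IllegalStateException(
          "Application attempt " + attemptId + " doesn't exist in cache.");
    }
    lastHeartbeatMs.put(attemptId, nowMs);
  }

  /** Returns true if an allocate call delayed by delayMs still finds the attempt. */
  static boolean survivesDelay(long expiryIntervalMs, long delayMs) {
    AttemptExpiryModel rm = new AttemptExpiryModel(expiryIntervalMs);
    rm.register("attempt_1");
    rm.advance(delayMs);
    try {
      rm.allocate("attempt_1");
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(survivesDelay(1500, 1000)); // true: within the 1.5 s interval
    System.out.println(survivesDelay(1500, 2000)); // false: attempt already evicted
    System.out.println(survivesDelay(6000, 4000)); // true: hypothetical larger interval
  }
}
```

        In this simplified model, a larger interval only widens the window in which a delayed allocate still succeeds; the timing dependence itself remains.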


          People

          • Assignee: rkanter Robert Kanter
          • Reporter: rkanter Robert Kanter
          • Votes: 0
          • Watchers: 4
