Uploaded image for project: 'Hadoop YARN'
  1. Hadoop YARN
  2. YARN-9798

ApplicationMasterServiceTestBase#testRepeatedFinishApplicationMaster fails intermittently

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0
    • Component/s: test
    • Labels:
      None

      Description

      Found intermittent failure of ApplicationMasterServiceTestBase#testRepeatedFinishApplicationMaster in YARN-9714 jenkins report, the cause is that the assertion which will make sure dispatcher has handled UNREGISTERED event but not wait until all events in dispatcher are handled, we need to add rm.drainEvents() before that assertion to fix this issue.

      Failure info:

      [ERROR] testRepeatedFinishApplicationMaster(org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterServiceCapacity)  Time elapsed: 0.559 s  <<< FAILURE!
      java.lang.AssertionError: Expecting only one event expected:<1> but was:<0>
      	at org.junit.Assert.fail(Assert.java:88)
      	at org.junit.Assert.failNotEquals(Assert.java:834)
      	at org.junit.Assert.assertEquals(Assert.java:645)
      	at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterServiceTestBase.testRepeatedFinishApplicationMaster(ApplicationMasterServiceTestBase.java:385)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
      	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.lang.Thread.run(Thread.java:748)
      

      Standard output:

      2019-08-29 06:59:54,458 ERROR [AsyncDispatcher event handler] resourcemanager.ResourceManager (ResourceManager.java:handle(1088)) - Error in handling event type REGISTERED for applicationAttempt appattempt_1567061994047_0001_000001
      org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException
      	at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:276)
      	at org.apache.hadoop.yarn.event.DrainDispatcher$2.handle(DrainDispatcher.java:91)
      	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1679)
      	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMRegisteredTransition.transition(RMAppAttemptImpl.java:1658)
      	at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
      	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
      	at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
      	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
      	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:914)
      	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:121)
      	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:1086)
      	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:1067)
      	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:200)
      	at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterServiceTestBase$CountingDispatcher.dispatch(ApplicationMasterServiceTestBase.java:401)
      	at org.apache.hadoop.yarn.event.DrainDispatcher$1.run(DrainDispatcher.java:76)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.InterruptedException
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
      	at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
      	at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:339)
      	at org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:268)
      	... 15 more
      

        Attachments

        1. YARN-9798.001.patch
          1 kB
          Tao Yang

          Activity

            People

            • Assignee:
              Tao Yang Tao Yang
              Reporter:
              Tao Yang Tao Yang
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: