  1. Hadoop YARN
  2. YARN-2649

Flaky test TestAMRMRPCNodeUpdates

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Sometimes the test fails with the following error:

      testAMRMUnusableNodes(org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates) Time elapsed: 41.73 sec <<< FAILURE!
      junit.framework.AssertionFailedError: AppAttempt state is not correct (timedout) expected:<ALLOCATED> but was:<SCHEDULED>
      at junit.framework.Assert.fail(Assert.java:50)
      at junit.framework.Assert.failNotEquals(Assert.java:287)
      at junit.framework.Assert.assertEquals(Assert.java:67)
      at org.apache.hadoop.yarn.server.resourcemanager.MockAM.waitForState(MockAM.java:82)
      at org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:382)
      at org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates.testAMRMUnusableNodes(TestAMRMRPCNodeUpdates.java:125)

      When this happens, SchedulerEventType.NODE_UPDATE is processed before RMAppAttemptEvent.ATTEMPT_ADDED. This is possible because the test waits only for RMAppState.ACCEPTED before having the NM send a heartbeat, so the app attempt may still be in SUBMITTED when the node update reaches the scheduler. The failure can be reproduced using a custom AsyncDispatcher with a CountDownLatch. Here is the log from a failing run:

      App State is : ACCEPTED
      2014-10-05 21:25:07,305 INFO  [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(670)) - appattempt_1412569506932_0001_000001 State change from NEW to SUBMITTED
      2014-10-05 21:25:07,305 DEBUG [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeStatusEvent.EventType: STATUS_UPDATE
      2014-10-05 21:25:07,305 DEBUG [AsyncDispatcher event handler] rmnode.RMNodeImpl (RMNodeImpl.java:handle(384)) - Processing 127.0.0.1:1234 of type STATUS_UPDATE
      AppAttempt : appattempt_1412569506932_0001_000001 State is : SUBMITTED Waiting for state : ALLOCATED
      2014-10-05 21:25:07,306 DEBUG [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.AppAttemptAddedSchedulerEvent.EventType: APP_ATTEMPT_ADDED
      
      2014-10-05 21:25:07,328 DEBUG [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.scheduler.event.NodeUpdateSchedulerEvent.EventType: NODE_UPDATE
      
      2014-10-05 21:25:07,330 DEBUG [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(164)) - Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptEvent.EventType: ATTEMPT_ADDED
      2014-10-05 21:25:07,331 DEBUG [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(658)) - Processing event for appattempt_1412569506932_0001_000001 of type ATTEMPT_ADDED
      
      2014-10-05 21:25:07,333 INFO  [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(670)) - appattempt_1412569506932_0001_000001 State change from SUBMITTED to SCHEDULED
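
      The ordering problem in the log above can be illustrated outside Hadoop with a small toy model. This is only a sketch under stated assumptions, not the test code or the attached patch: ToyAttempt, AttemptRaceSketch, the two single-thread executors, and the gate latch are all hypothetical stand-ins, and the state machine is reduced to the three states that appear in the failure. A CountDownLatch holds back the ATTEMPT_ADDED handler so that the bad interleaving is forced deterministically; the second run waits for SCHEDULED before sending the node update, which is one plausible way to avoid the race.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class AttemptRaceSketch {
    enum State { SUBMITTED, SCHEDULED, ALLOCATED }

    /** Toy stand-in for an app attempt; NOT the real RMAppAttemptImpl. */
    static final class ToyAttempt {
        final AtomicReference<State> state = new AtomicReference<>(State.SUBMITTED);

        void onAttemptAdded() {
            state.compareAndSet(State.SUBMITTED, State.SCHEDULED);
        }

        // Assumption of this toy model: a node update allocates a container
        // only if the attempt is already SCHEDULED; otherwise it is a no-op.
        void onNodeUpdate() {
            state.compareAndSet(State.SCHEDULED, State.ALLOCATED);
        }
    }

    /** Polls until the attempt reaches the expected state or the timeout expires. */
    static boolean waitForState(ToyAttempt attempt, State expected, long timeoutMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (attempt.state.get() != expected) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(5);
        }
        return true;
    }

    /** Runs one scenario and returns the attempt's final state. */
    static State run(boolean waitForScheduled) throws Exception {
        ToyAttempt attempt = new ToyAttempt();
        // Two independent event threads, so the handlers can run in either order.
        ExecutorService attemptDispatcher = Executors.newSingleThreadExecutor();
        ExecutorService schedulerDispatcher = Executors.newSingleThreadExecutor();
        CountDownLatch gate = new CountDownLatch(1);
        try {
            // ATTEMPT_ADDED is queued but held back until the gate opens.
            attemptDispatcher.submit(() -> {
                try {
                    gate.await();
                    attempt.onAttemptAdded();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            if (waitForScheduled) {
                // Let ATTEMPT_ADDED proceed and wait for SCHEDULED before the heartbeat.
                gate.countDown();
                if (!waitForState(attempt, State.SCHEDULED, 5000)) {
                    throw new IllegalStateException("timed out waiting for SCHEDULED");
                }
                schedulerDispatcher.submit(() -> { attempt.onNodeUpdate(); }).get();
            } else {
                // Forced bad order: NODE_UPDATE is handled while still SUBMITTED.
                schedulerDispatcher.submit(() -> { attempt.onNodeUpdate(); }).get();
                gate.countDown();
            }
            attemptDispatcher.shutdown();
            attemptDispatcher.awaitTermination(5, TimeUnit.SECONDS);
            return attempt.state.get();
        } finally {
            attemptDispatcher.shutdownNow();
            schedulerDispatcher.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("heartbeat right away:     " + run(false));
        System.out.println("wait for SCHEDULED first: " + run(true));
    }
}
```

      In this model the first run ends in SCHEDULED, matching the assertion failure, because the only node update was consumed before the attempt was schedulable. The second run ends in ALLOCATED. The real resource manager has more states and more event types, so this only mirrors the ordering dependency, not the actual transitions.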
      
      

      Attachments

        1. YARN-2649-2.patch
          1 kB
          Ming Ma
        2. YARN-2649.patch
          2 kB
          Ming Ma

            People

              Assignee: Ming Ma (mingma)
              Reporter: Ming Ma (mingma)