Description
TestKill.testKillTask() can fail if the async dispatcher thread is slower than the test's thread.
2017-05-26 11:43:26,532 INFO [AsyncDispatcher event handler] impl.JobImpl (JobImpl.java:handle(1006)) - job_0_0000Job Transitioned from INITED to SETUP
Job State is : RUNNING
Job State is : RUNNING Waiting for state : SUCCEEDED map progress : 0.0 reduce progress : 0.0
2017-05-26 11:43:26,538 INFO [CommitterEvent Processor #0] commit.CommitterEventHandler (CommitterEventHandler.java:run(231)) - Processing the event EventType: JOB_SETUP
2017-05-26 11:43:26,540 INFO [AsyncDispatcher event handler] impl.TaskImpl (TaskImpl.java:handle(661)) - task_0_0000_m_000000 Task Transitioned from NEW to KILLED
2017-05-26 11:43:26,540 ERROR [AsyncDispatcher event handler] impl.JobImpl (JobImpl.java:handle(998)) - Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: JOB_TASK_COMPLETED at SETUP
	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1366)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1362)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
	at java.lang.Thread.run(Thread.java:745)
2017-05-26 11:43:26,541 INFO [AsyncDispatcher event handler] impl.JobImpl (JobImpl.java:handle(1006)) - job_0_0000Job Transitioned from SETUP to ERROR
2017-05-26 11:43:26,542 INFO [AsyncDispatcher event handler] app.MRAppMaster (MRAppMaster.java:serviceStop(978)) - Skipping cleaning up the staging dir. assuming AM will be retried.
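As the log shows, the job already reports the state RUNNING while its internal state is still SETUP, so the test can kill the task too early and JobImpl then receives JOB_TASK_COMPLETED in SETUP, which is an invalid transition. Before killing the task, the test has to wait until the job's internal state is JobStateInternal.RUNNING and not JobStateInternal.SETUP.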
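The wait described above can be sketched as a simple polling loop. This is only an illustration under stated assumptions, not the actual patch: the class InternalStateWait, the InternalState enum and the waitForState helper are hypothetical stand-ins and do not use any Hadoop classes. A background thread plays the role of the slow async dispatcher.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class InternalStateWait {
    // Hypothetical stand-in for the job's internal states; not the Hadoop enum.
    public enum InternalState { INITED, SETUP, RUNNING }

    // Polls the supplier until it returns the expected state or the timeout expires.
    public static <T> void waitForState(Supplier<T> current, T expected,
                                        long timeoutMs, long pollMs)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!expected.equals(current.get())) {
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("still " + current.get()
                        + ", expected " + expected);
            }
            Thread.sleep(pollMs);
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicReference<InternalState> state =
                new AtomicReference<>(InternalState.SETUP);

        // Simulates a dispatcher thread that is slower than the test thread.
        Thread dispatcher = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            state.set(InternalState.RUNNING);
        });
        dispatcher.start();

        // Block until the internal state is RUNNING; only then is it safe
        // to send the kill event in the real test.
        waitForState(state::get, InternalState.RUNNING, 5000, 10);
        System.out.println("internal state: " + state.get());
        dispatcher.join();
    }
}
```

In the real test, the supplier would read the internal state of the JobImpl under test, and the kill would be issued only after the wait returns.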
Attachments
Issue Links
- duplicates MAPREDUCE-6815 Fix flaky TestKill.testKillTask() (Resolved)