Details
- Bug
- Status: Closed
- Critical
- Resolution: Fixed
- 0.23.3
- None
Description
We saw a job go into the ERROR state because of an invalid state transition.
3,600 INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
attempt_1342238829791_2501_m_007743_0 TaskAttempt Transitioned from SUCCEEDED
to FAILED
2012-07-16 08:49:53,600 INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
attempt_1342238829791_2501_m_008850_0 TaskAttempt Transitioned from SUCCEEDED
to FAILED
2012-07-16 08:49:53,600 INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
attempt_1342238829791_2501_m_017344_1000 TaskAttempt Transitioned from RUNNING
to SUCCESS_CONTAINER_CLEANUP
2012-07-16 08:49:53,601 ERROR [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Can't handle this
event at current state for attempt_1342238829791_2501_m_000027_0
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: TA_TOO_MANY_FETCH_FAILURE at FAILED
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:954)
at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:133)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:913)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:905)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.realDispatch(RecoveryService.java:285)
at org.apache.hadoop.mapreduce.v2.app.recover.RecoveryService$RecoveryDispatcher.dispatch(RecoveryService.java:281)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
at java.lang.Thread.run(Thread.java:619)
2012-07-16 08:49:53,601 INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl:
attempt_1342238829791_2501_m_029091_1000 TaskAttempt Transitioned from RUNNING
to SUCCESS_CONTAINER_CLEANUP
2012-07-16 08:49:53,601 INFO [IPC Server handler 17 on 47153]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from
attempt_1342238829791_2501_r_000461_1000
It looks like we may have received two TA_TOO_MANY_FETCH_FAILURE events for the same task attempt. The first one moved the attempt from SUCCEEDED to FAILED, and the second one then failed because the FAILED state has no valid transition for that event.
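To illustrate the suspected sequence, here is a minimal, self-contained sketch of a table-driven state machine. It does not use the Hadoop classes and is not the committed fix; the class `AttemptStateMachineSketch`, its `ignoreDuplicateInFailed` flag, and the use of `IllegalStateException` are all hypothetical stand-ins. It only shows why a second TA_TOO_MANY_FETCH_FAILURE is rejected at FAILED, and one possible way to tolerate it by registering a FAILED-to-FAILED self-transition.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch, not the real TaskAttemptImpl or StateMachineFactory.
final class AttemptStateMachineSketch {
    enum State { SUCCEEDED, FAILED }
    enum Event { TA_TOO_MANY_FETCH_FAILURE }

    // Transition table: current state -> (event -> next state).
    private final Map<State, Map<Event, State>> transitions =
            new EnumMap<State, Map<Event, State>>(State.class);
    private State current = State.SUCCEEDED;

    // ignoreDuplicateInFailed is an assumed switch for this sketch only.
    AttemptStateMachineSketch(boolean ignoreDuplicateInFailed) {
        addTransition(State.SUCCEEDED, Event.TA_TOO_MANY_FETCH_FAILURE, State.FAILED);
        if (ignoreDuplicateInFailed) {
            // Self-transition: a repeated event at FAILED becomes a no-op.
            addTransition(State.FAILED, Event.TA_TOO_MANY_FETCH_FAILURE, State.FAILED);
        }
    }

    private void addTransition(State from, Event event, State to) {
        transitions
                .computeIfAbsent(from, k -> new EnumMap<Event, State>(Event.class))
                .put(event, to);
    }

    // Applies the event, or throws if the current state has no entry for it.
    State handle(Event event) {
        Map<Event, State> row = transitions.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        current = next;
        return current;
    }

    State current() {
        return current;
    }

    public static void main(String[] args) {
        AttemptStateMachineSketch strict = new AttemptStateMachineSketch(false);
        strict.handle(Event.TA_TOO_MANY_FETCH_FAILURE); // SUCCEEDED -> FAILED
        try {
            strict.handle(Event.TA_TOO_MANY_FETCH_FAILURE); // duplicate event
        } catch (IllegalStateException e) {
            System.out.println("strict: " + e.getMessage());
        }

        AttemptStateMachineSketch tolerant = new AttemptStateMachineSketch(true);
        tolerant.handle(Event.TA_TOO_MANY_FETCH_FAILURE);
        tolerant.handle(Event.TA_TOO_MANY_FETCH_FAILURE); // ignored, stays FAILED
        System.out.println("tolerant: " + tolerant.current());
    }
}
```

In the strict variant the duplicate event reproduces the shape of the error in the log above; in the tolerant variant the attempt simply stays in FAILED. Whether ignoring the duplicate is the right behavior for the real TaskAttemptImpl is a separate design question that this sketch does not settle.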
Attachments
Issue Links
- is related to
  - MAPREDUCE-5409 MRAppMaster throws InvalidStateTransitonException: Invalid event: TA_TOO_MANY_FETCH_FAILURE at KILLED for TaskAttemptImpl (Closed)