Description
A race between a DAG kill and vertices completing successfully does not appear to be handled. The DAG logs the following when a vertex completes while the DAG is in KILL_WAIT:
impl.DAGImpl (DAGImpl.java:handle(592)) - Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: DAG_VERTEX_COMPLETED at KILL_WAIT
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
checkForCompletion ignores the current KILL_WAIT state and returns a final state of SUCCEEDED. This breaks the state machine, because the only valid transitions from KILL_WAIT are to KILL_WAIT or KILLED.
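One possible shape for a fix is sketched below. This is a simplified, hypothetical model and not the actual DAGImpl code: the DagState enum, the vertex counters, and the method signatures are all assumptions made for illustration. The idea is that the completion check looks at the current state first, so a DAG that is already in KILL_WAIT finishes as KILLED rather than SUCCEEDED.

```java
/**
 * Hypothetical, simplified model of the DAG completion check.
 * Names and signatures are illustrative assumptions, not Tez APIs.
 */
final class DagCompletionSketch {

    /** Reduced set of DAG states, enough to show the race. */
    enum DagState { RUNNING, KILL_WAIT, SUCCEEDED, KILLED }

    /**
     * Decides the next DAG state after a vertex reports completion.
     * The counters are assumed to be maintained by the caller.
     */
    static DagState checkForCompletion(DagState current,
                                       int numCompletedVertices,
                                       int numVertices) {
        if (numCompletedVertices < numVertices) {
            // Still waiting on vertices: stay in the current state.
            return current;
        }
        if (current == DagState.KILL_WAIT) {
            // A kill was requested earlier, so the DAG must end as KILLED
            // even if every vertex happened to finish successfully.
            return DagState.KILLED;
        }
        return DagState.SUCCEEDED;
    }

    /** Transitions allowed in this simplified model. */
    static boolean isLegalTransition(DagState from, DagState to) {
        switch (from) {
            case RUNNING:
                return to == DagState.RUNNING
                    || to == DagState.KILL_WAIT
                    || to == DagState.SUCCEEDED;
            case KILL_WAIT:
                return to == DagState.KILL_WAIT || to == DagState.KILLED;
            default:
                // SUCCEEDED and KILLED are terminal here.
                return false;
        }
    }
}
```

With this check, a vertex completion that arrives during KILL_WAIT yields either KILL_WAIT (vertices still outstanding) or KILLED (all vertices done), both of which are legal targets from KILL_WAIT in the model above.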
Issue Links
- relates to TEZ-141 DAG does not kill running vertices when going into failed state (Resolved)