Apache Tez / TEZ-148

DAG state machine does not handle successfully completing vertices at kill wait


Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed

    Description

      A race between a DAG kill and vertices completing successfully does not appear to be handled.

      impl.DAGImpl (DAGImpl.java:handle(592)) - Can't handle this event at current state
      org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: DAG_VERTEX_COMPLETED at KILL_WAIT
      at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
      at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
      at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
      at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)

      checkForCompletion ignores the current KILL_WAIT state and returns a final state of SUCCEEDED. This breaks the state machine, because the only valid transitions out of KILL_WAIT are to KILL_WAIT or KILLED.
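
The behavior described above can be illustrated with a minimal sketch. This is not the Tez source or the committed fix: the `DagState` enum, the `DagCompletionSketch` class, and the signature of `checkForCompletion` here are simplified, hypothetical stand-ins. The sketch also assumes that every completed vertex outside a kill has succeeded. It only shows the idea that a completion check should consult the current state before choosing a final state.

```java
// Hypothetical, simplified stand-in for the DAG states; not the real Tez enum.
enum DagState { RUNNING, KILL_WAIT, SUCCEEDED, KILLED }

final class DagCompletionSketch {

    // Decide the next DAG state after a vertex completes.
    // Assumption: completedVertices counts vertices in any terminal state.
    static DagState checkForCompletion(DagState current,
                                       int completedVertices,
                                       int totalVertices) {
        boolean allDone = completedVertices >= totalVertices;

        if (current == DagState.KILL_WAIT) {
            // From KILL_WAIT the only legal targets are KILL_WAIT or KILLED,
            // so a successful vertex completion must never yield SUCCEEDED.
            return allDone ? DagState.KILLED : DagState.KILL_WAIT;
        }

        if (!allDone) {
            // Still waiting on vertices; stay in the current state.
            return current;
        }

        // Assumption for this sketch: outside a kill, all completed
        // vertices succeeded, so the DAG as a whole succeeded.
        return DagState.SUCCEEDED;
    }
}
```

With this shape, a DAG_VERTEX_COMPLETED event that arrives during KILL_WAIT keeps the DAG in KILL_WAIT until the last vertex reports in, then moves it to KILLED, instead of attempting an invalid transition to SUCCEEDED.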

            People

              mikeliddell Mike Liddell
              hitesh Hitesh Shah
