Apache Tez / TEZ-3052

Task internal error due to Invalid event: T_ATTEMPT_FAILED at FAILED

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.7.1, 0.8.3
    • Component/s: None
    • Labels: None
    • Reviewed

    Description

      A task encountered an internal error due to "Invalid event: T_ATTEMPT_FAILED at FAILED". The task had two outstanding attempts, one of which was speculative. The main attempt failed, causing the task to fail, and when the speculative attempt subsequently failed, its T_ATTEMPT_FAILED event triggered the invalid state transition.

      It appears the TaskImpl state machine needs some hardening against speculative attempt events that arrive late. Besides this scenario, I think there may be others; e.g., a speculative attempt succeeding just as the overall task fails appears to be unhandled.
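
To make the failure mode concrete, here is a minimal, self-contained sketch of a table-driven state machine that reproduces the error and shows one possible hardening. It is not the actual TaskImpl code and does not reflect the contents of the attached patch: the class `TaskStateSketch`, its reduced set of states and events, and the `hardened` flag are all hypothetical, and the sketch assumes for simplicity that a single attempt failure fails the whole task.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, heavily simplified model of a task state machine.
// NOT the real TaskImpl; names and structure are illustrative assumptions.
class TaskStateSketch {
    enum State { RUNNING, SUCCEEDED, FAILED }
    enum Event { T_ATTEMPT_FAILED, T_ATTEMPT_SUCCEEDED }

    // Transition table: current state -> (event -> next state).
    private final Map<State, Map<Event, State>> transitions = new EnumMap<>(State.class);
    private State state = State.RUNNING;

    TaskStateSketch(boolean hardened) {
        // Assumption for brevity: one attempt failure fails the task (no retries).
        Map<Event, State> running = new EnumMap<>(Event.class);
        running.put(Event.T_ATTEMPT_FAILED, State.FAILED);
        running.put(Event.T_ATTEMPT_SUCCEEDED, State.SUCCEEDED);
        transitions.put(State.RUNNING, running);

        Map<Event, State> failed = new EnumMap<>(Event.class);
        if (hardened) {
            // Late events from a speculative attempt are absorbed:
            // the task stays FAILED instead of raising an error.
            failed.put(Event.T_ATTEMPT_FAILED, State.FAILED);
            failed.put(Event.T_ATTEMPT_SUCCEEDED, State.FAILED);
        }
        transitions.put(State.FAILED, failed);
        transitions.put(State.SUCCEEDED, new EnumMap<>(Event.class));
    }

    State getState() {
        return state;
    }

    // Applies an event, rejecting any transition missing from the table.
    State handle(Event event) {
        State next = transitions.get(state).get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event: " + event + " at " + state);
        }
        state = next;
        return state;
    }
}
```

With `new TaskStateSketch(false)`, delivering T_ATTEMPT_FAILED twice (main attempt, then the late speculative attempt) throws on the second event with a message of the same shape as the one in this report. With `new TaskStateSketch(true)`, the second event is absorbed and the task remains FAILED; a late T_ATTEMPT_SUCCEEDED is absorbed the same way, which covers the other scenario mentioned above.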

      Attachments

        1. TEZ-3052.001.patch
          5 kB
          Jason Darrell Lowe

          People

            Assignee: Jason Darrell Lowe (jlowe)
            Reporter: Jason Darrell Lowe (jlowe)
