Apache Tez / TEZ-2404

Handle DataMovementEvent before its TaskAttemptCompletedEvent

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      TEZ-2325 routes TASK_ATTEMPT_COMPLETED_EVENT directly to the attempt, but this causes a recovery issue. Recovery requires that a DataMovementEvent be handled before its TaskAttemptCompletedEvent; otherwise the DataMovementEvent may be lost during recovery, causing its dependent tasks to hang.

      There are two ways to fix this issue:

      1. Still route TaskAttemptCompletedEvent in Vertex
      2. Route DataMovementEvent before TaskAttemptCompletedEvent in TezTaskAttemptListener
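
      The second option can be illustrated with a minimal sketch. It is not taken from the attached patches and does not use the real Tez classes: EventKind, SketchEvent, and orderForRouting are hypothetical names introduced only for illustration. It assumes the listener receives a batch of events and simply defers completion events until every other event in the batch (including data movement events) has been routed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: these types are hypothetical stand-ins,
// not the real Tez event classes or the TEZ-2404 patch.
final class EventOrderingSketch {

    enum EventKind { DATA_MOVEMENT, TASK_ATTEMPT_COMPLETED, OTHER }

    static final class SketchEvent {
        final EventKind kind;
        final String attemptId;

        SketchEvent(EventKind kind, String attemptId) {
            this.kind = kind;
            this.attemptId = attemptId;
        }

        @Override
        public String toString() {
            return kind + "(" + attemptId + ")";
        }
    }

    /**
     * Returns the batch in routing order: all non-completion events first
     * (keeping their original relative order), then the completion events.
     * This guarantees a data movement event is never routed after the
     * completion event that arrived in the same batch.
     */
    static List<SketchEvent> orderForRouting(List<SketchEvent> batch) {
        List<SketchEvent> routed = new ArrayList<>();
        List<SketchEvent> completions = new ArrayList<>();
        for (SketchEvent e : batch) {
            if (e.kind == EventKind.TASK_ATTEMPT_COMPLETED) {
                completions.add(e);   // defer until the end
            } else {
                routed.add(e);        // route immediately, in arrival order
            }
        }
        routed.addAll(completions);
        return routed;
    }

    public static void main(String[] args) {
        List<SketchEvent> batch = Arrays.asList(
            new SketchEvent(EventKind.TASK_ATTEMPT_COMPLETED, "attempt_1"),
            new SketchEvent(EventKind.DATA_MOVEMENT, "attempt_1"));
        // The completion event arrived first but is routed last.
        System.out.println(orderForRouting(batch));
    }
}
```

      With this ordering, any recovery log written in routing order would record the data movement event before the completion event, which is the property the description says recovery needs.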

      Attachments

        1. TEZ-2404-3.patch
          8 kB
          Jeff Zhang
        2. TEZ-2404-2.patch
          7 kB
          Jeff Zhang
        3. TEZ-2404-1.patch
          6 kB
          Jeff Zhang

            People

              Assignee: Jeff Zhang (zjffdu)
              Reporter: Jeff Zhang (zjffdu)
              Votes: 0
              Watchers: 4