TEZ-1024 (sub-task of TEZ-15, Support for DAG AM recovery; project: Apache Tez)

Fix determination of failed attempts in recovery

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.0
    • Component/s: None
    • Labels: None

    Description

      The current code compares the number of attempts a task has run against task max.attempts to decide whether too many attempts already ran in the previous AM attempt. However, this miscounts killed attempts (due to preemption, etc.) as failed attempts. We should probably look at each attempt's status to determine the number of failed attempts.
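
      A minimal sketch of the difference, assuming recovery yields a list of terminal attempt states for a task. The class, enum, and method names below are hypothetical and are not taken from the Tez code base or from the attached patches; they only illustrate counting by status instead of by total attempt count.

```java
import java.util.Arrays;
import java.util.List;

public class FailedAttemptCount {
    // Hypothetical terminal states of a task attempt (not Tez's actual enum).
    public enum AttemptState { SUCCEEDED, FAILED, KILLED }

    // Naive check described in the issue: every recovered attempt counts
    // against the limit, so killed (e.g. preempted) attempts are miscounted.
    public static boolean exceedsLimitByTotal(List<AttemptState> recovered, int maxAttempts) {
        return recovered.size() >= maxAttempts;
    }

    // Status-based count: only attempts that actually FAILED are counted.
    public static int countFailed(List<AttemptState> recovered) {
        int failed = 0;
        for (AttemptState state : recovered) {
            if (state == AttemptState.FAILED) {
                failed++;
            }
        }
        return failed;
    }

    // Status-based check: compare the failed count against the limit.
    public static boolean exceedsLimitByStatus(List<AttemptState> recovered, int maxAttempts) {
        return countFailed(recovered) >= maxAttempts;
    }

    public static void main(String[] args) {
        // Three preempted attempts and one real failure, with a limit of 4.
        List<AttemptState> recovered = Arrays.asList(
            AttemptState.KILLED, AttemptState.KILLED,
            AttemptState.FAILED, AttemptState.KILLED);
        System.out.println("by total:  " + exceedsLimitByTotal(recovered, 4));
        System.out.println("by status: " + exceedsLimitByStatus(recovered, 4));
    }
}
```

      With three killed attempts and one failed attempt against a limit of 4, the total-based check reports the limit as reached, while the status-based check does not, which is the behavior the issue asks for.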

      Attachments

        1. Tez-1024.patch
          5 kB
          Jeff Zhang
        2. Tez-1024-2.patch
          14 kB
          Jeff Zhang
        3. Tez-1024-3.patch
          15 kB
          Jeff Zhang

          People

            zjffdu Jeff Zhang
            bikassaha Bikas Saha
