Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.5.1
    • Component/s: None
    • Labels: None

    Description

      • [Fine-grained, task-level recovery] In a vertex, task 0 is done and task 1 is running. A history flush happens, then the AM dies. Once the AM is recovered, task 0 is not re-run; task 1 is re-run.
      • [Data movement types] Test AM recovery with all data movement types, including 1-1, broadcast, and scatter-gather, with and without shuffle. The AM should die in two scenarios: when the first vertex's tasks have finished completely, and when they have finished only partially.
      • [Kill AM many times] Set the AM max attempts to a high number and kill many attempts. The last AM should still be recoverable from the latest AM history data.
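
The task-level scenario above can be illustrated with a small, self-contained model. This is only a sketch of the expected outcome, not Tez's recovery code: the class `RecoverySketch`, the enum `TaskState`, and the method `tasksToRerun` are hypothetical names, and the flushed history is assumed to be a simple map from task index to last recorded state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the expected recovery decision; not part of Tez.
class RecoverySketch {
    // Assumed task states as they would appear in the flushed history.
    enum TaskState { SUCCEEDED, RUNNING }

    // Returns the indices of tasks the recovered AM is expected to re-run:
    // every task whose last flushed state is not SUCCEEDED, including
    // tasks with no history entry at all.
    static List<Integer> tasksToRerun(int numTasks, Map<Integer, TaskState> flushedHistory) {
        List<Integer> rerun = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            if (flushedHistory.get(i) != TaskState.SUCCEEDED) {
                rerun.add(i);
            }
        }
        return rerun;
    }

    public static void main(String[] args) {
        // Scenario from the description: task 0 done, task 1 running,
        // history flushed, then the AM dies.
        Map<Integer, TaskState> history = new HashMap<>();
        history.put(0, TaskState.SUCCEEDED);
        history.put(1, TaskState.RUNNING);
        System.out.println(tasksToRerun(2, history)); // prints [1]
    }
}
```

A real test would assert the same outcome against the recovered DAG's task attempts rather than against a map.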

      Attachments

        1. Tez-1559.patch
          107 kB
          Jeff Zhang
        2. Tez-1559-2.patch
          110 kB
          Jeff Zhang
        3. Tez-1559-3.patch
          29 kB
          Jeff Zhang

          People

            zjffdu Jeff Zhang
            zjffdu Jeff Zhang
