Apache Tez / TEZ-4338

Tez should consider node information to realize OUTPUT_LOST as early as possible - upstream(mapper) problems

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.1
    • Fix Version/s: 0.10.2
    • Component/s: None
    • Labels: None

    Description

      I decided to split TEZ-4139 into 2 separate tasks, because upstream problems can be handled independently, and that is what I'm focusing on now.

      So, from TEZ-4139, this ticket covers the upstream (mapper) side, handling the failures reported by downstream tasks as follows:
      collect all upstream mapper task attempts reported as failed for a vertex, and if the number of reports for the same source (map) host goes beyond a certain threshold, blame the mapper task attempt immediately. The goal is to blame the mapper task attempt as soon as possible when a read error is likely caused by an upstream node failure. (This is somewhat similar in goal to TEZ-3910, but TEZ-3910 currently aims to give the AM complete control over failing the downstream task.)
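
      A minimal sketch of the per-host counting described above, to make the idea concrete. It is not the actual patch: the class name `SourceHostFailureTracker`, the methods `reportReadError` and `reportedAttempts`, and the threshold semantics (blame once the number of distinct reported mapper attempts on one host reaches the threshold) are all assumptions for illustration and are not part of the Tez API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical helper (not a Tez class): tracks, for one vertex, which
 * upstream (mapper) task attempts were reported as unreadable, grouped
 * by the source host they ran on.
 */
class SourceHostFailureTracker {
    // Assumed threshold: distinct reported attempts per host needed to blame.
    private final int hostFailureThreshold;

    // source host -> distinct mapper attempt ids reported as failed on it
    private final Map<String, Set<String>> failedAttemptsByHost = new HashMap<>();

    SourceHostFailureTracker(int hostFailureThreshold) {
        if (hostFailureThreshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.hostFailureThreshold = hostFailureThreshold;
    }

    /**
     * Records a read error reported by a downstream task.
     *
     * @return true if the reported mapper attempt should be blamed
     *         immediately, because enough distinct attempts on the same
     *         host have already been reported
     */
    boolean reportReadError(String sourceHost, String sourceAttemptId) {
        Set<String> attempts =
            failedAttemptsByHost.computeIfAbsent(sourceHost, h -> new HashSet<>());
        // A Set ignores repeated reports about the same attempt.
        attempts.add(sourceAttemptId);
        return attempts.size() >= hostFailureThreshold;
    }

    /** Number of distinct mapper attempts reported so far for a host. */
    int reportedAttempts(String sourceHost) {
        Set<String> attempts = failedAttemptsByHost.get(sourceHost);
        return attempts == null ? 0 : attempts.size();
    }
}
```

      Counting distinct attempts per host, rather than raw reports, keeps one noisy downstream task from triggering the blame on its own; whether the real change counts this way is not stated in this ticket.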

            People

              Assignee: abstractdog László Bodor
              Reporter: abstractdog László Bodor
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h 20m