Apache Tez / TEZ-3803

Tasks can get killed due to insufficient progress while waiting for shuffle inputs to complete


Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      In a scenario where a downstream task has no slow start and is launched before all of its shuffle inputs are done, the task can time out, because the wait does not notify progress (set the "progress is being made" bit) as it does in MapReduce.
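
      The failure mode described above can be illustrated with a minimal sketch. This is not the TEZ-3803 patch and does not use the Tez API: the class `ShuffleWaitSketch`, the interface `ProgressNotifier`, and the method `waitForInputs` are hypothetical names chosen for illustration. The idea is that a task blocked on incomplete inputs wakes up at a fixed interval and signals liveness, so that a progress-based timeout does not kill it while it is legitimately waiting.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical names throughout; this is not the Tez API or the actual patch.
class ShuffleWaitSketch {

    /** Hypothetical callback standing in for the "progress is being made" bit. */
    interface ProgressNotifier {
        void notifyProgress();
    }

    /**
     * Blocks until inputsDone reaches zero. Instead of one unbounded wait,
     * the loop wakes every pollMillis and signals liveness, so a monitor
     * that kills tasks for lack of progress sees regular updates.
     */
    static void waitForInputs(CountDownLatch inputsDone,
                              ProgressNotifier notifier,
                              long pollMillis) throws InterruptedException {
        // await(...) returns false when the interval elapses before completion.
        while (!inputsDone.await(pollMillis, TimeUnit.MILLISECONDS)) {
            notifier.notifyProgress();
        }
        // One final signal once all inputs are available.
        notifier.notifyProgress();
    }
}
```

      In this sketch the poll interval would need to be shorter than whatever progress timeout the framework enforces; the actual interval and notification mechanism used by the fix are in the attached patches.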

      Attachments

        1. TEZ-3803.005.patch
          8 kB
          Kuhu Shukla
        2. TEZ-3803.004.patch
          8 kB
          Kuhu Shukla
        3. TEZ-3803.003.patch
          10 kB
          Kuhu Shukla
        4. TEZ-3803.002.patch
          13 kB
          Kuhu Shukla
        5. TEZ-3803.001.patch
          14 kB
          Kuhu Shukla

          People

            Assignee: Kuhu Shukla (kshukla)
            Reporter: Kuhu Shukla (kshukla)
            Votes: 0
            Watchers: 5

            Dates

              Created:
              Updated:
              Resolved: