  1. Apache Tez
  2. TEZ-3803

Tasks can get killed due to insufficient progress while waiting for shuffle inputs to complete


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When a downstream task is not subject to slow start and is launched before all of its shuffle inputs are complete, the task can time out. While it waits for those inputs, it does not notify progress (set the "progress is being made" bit) the way MapReduce does, so it can be killed for insufficient progress.
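
      The general idea behind a fix of this kind can be sketched as a wait loop that wakes up periodically and reports liveness instead of blocking silently. This is only an illustrative sketch, not the code from the attached patches: `ShuffleWaitSketch`, `ProgressNotifier`, and `waitForInputs` are hypothetical names, and the latch stands in for whatever signals that all shuffle inputs have arrived.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class ShuffleWaitSketch {

  /** Hypothetical stand-in for the callback that tells the framework the task is alive. */
  interface ProgressNotifier {
    void notifyProgress();
  }

  /**
   * Blocks until inputsDone reaches zero. Each time intervalMillis elapses
   * without completion, the notifier is invoked so the task is not treated
   * as stuck. Returns how many notifications were sent.
   */
  static int waitForInputs(CountDownLatch inputsDone,
                           ProgressNotifier notifier,
                           long intervalMillis) throws InterruptedException {
    int notifications = 0;
    // await() returns false when the timeout elapses before the latch opens.
    while (!inputsDone.await(intervalMillis, TimeUnit.MILLISECONDS)) {
      notifier.notifyProgress();
      notifications++;
    }
    return notifications;
  }
}
```

      The interval in such a loop would need to be comfortably shorter than the framework's progress timeout for the notification to have any effect.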

        Attachments

        1. TEZ-3803.001.patch
          14 kB
          Kuhu Shukla
        2. TEZ-3803.002.patch
          13 kB
          Kuhu Shukla
        3. TEZ-3803.003.patch
          10 kB
          Kuhu Shukla
        4. TEZ-3803.004.patch
          8 kB
          Kuhu Shukla
        5. TEZ-3803.005.patch
          8 kB
          Kuhu Shukla


            People

            • Assignee:
              kshukla Kuhu Shukla
              Reporter:
              kshukla Kuhu Shukla
            • Votes:
              0
              Watchers:
              5
