  1. Apache Tez
  2. TEZ-2209

Fix pipelined shuffle to fetch data from any one attempt

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      • Currently, pipelined shuffle fails fast the moment it receives data from any attempt other than attempt 0. This check was added to prevent data from being copied from speculated attempts.
      • However, in some scenarios (such as LLAP), attempt 0 may be killed before it generates any data. In such cases, attempt 1 or a later attempt generates the actual data.
      • This JIRA allows pipelined shuffle to download data from any one attempt.
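
      The intended behavior can be illustrated with a minimal sketch. This is not the code from the attached patches: the class `AttemptTracker` and its methods are hypothetical names chosen for illustration, and the sketch assumes that the consumer sees a (source task index, attempt number) pair for each incoming piece of pipelined data. Instead of rejecting every attempt other than 0, it pins each source task to the first attempt it receives data from and rejects data from any other attempt of that task.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Tez): pins each source task to the
 * first attempt whose data was seen, rather than hard-coding attempt 0.
 */
class AttemptTracker {
    // source task index -> attempt number whose data was accepted first
    private final Map<Integer, Integer> pinnedAttempt = new HashMap<>();

    /**
     * Accepts data from the given attempt if no other attempt of the same
     * source task has been accepted before; otherwise fails fast.
     */
    void accept(int taskIndex, int attemptNumber) {
        // putIfAbsent returns the previously pinned attempt, or null if none
        Integer pinned = pinnedAttempt.putIfAbsent(taskIndex, attemptNumber);
        if (pinned != null && pinned != attemptNumber) {
            throw new IllegalStateException(
                "Task " + taskIndex + " already pinned to attempt " + pinned
                    + "; rejecting data from attempt " + attemptNumber);
        }
    }

    /** Returns the pinned attempt for a task, or null if none was seen yet. */
    Integer acceptedAttemptFor(int taskIndex) {
        return pinnedAttempt.get(taskIndex);
    }
}
```

      With this sketch, if attempt 0 of a task is killed before producing data and attempt 1 produces it, calling `accept(taskIndex, 1)` succeeds, while later data from a different attempt of the same task still fails fast, which keeps the protection against mixing output from speculated attempts.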

        Attachments

        1. TEZ-2209.1.patch
          29 kB
          Rajesh Balamohan
        2. TEZ-2209.2.patch
          32 kB
          Rajesh Balamohan
        3. TEZ-2209.3.patch
          32 kB
          Rajesh Balamohan
        4. TEZ-2209.4.patch
          33 kB
          Rajesh Balamohan

            People

            • Assignee:
              rajesh.balamohan Rajesh Balamohan
              Reporter:
              rajesh.balamohan Rajesh Balamohan

              Dates

              • Created:
                Updated:
                Resolved: