Apache Tez / TEZ-2251

Race condition in VertexImpl & Edge causes DAG to hang

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      Scenario:

      Vertex parallelism changes for "Reducer 5" and "Reducer 6" happen within a span of 3 milliseconds. Tasks of "Reducer 5" end up producing wrong partition details because, when they are scheduled, they see the updated task count of "Reducer 6". This causes the job to hang.
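
      The interleaving in this scenario can be illustrated with a minimal Java sketch. It is not the Tez code and not the committed fix: DownstreamVertex, setParallelism, and simulate are hypothetical names, and the task counts 10 and 4 are made up. It only shows that a task count captured before a parallelism change can differ from the count a task reads when it is scheduled afterwards, even when each individual read is properly locked.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ParallelismRaceSketch {

    /** Hypothetical stand-in for a downstream vertex; NOT the real VertexImpl. */
    static final class DownstreamVertex {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private int numTasks;

        DownstreamVertex(int numTasks) {
            this.numTasks = numTasks;
        }

        /** Reads the current task count under the read lock. */
        int getNumTasks() {
            lock.readLock().lock();
            try {
                return numTasks;
            } finally {
                lock.readLock().unlock();
            }
        }

        /** Changes the task count under the write lock. */
        void setParallelism(int newNumTasks) {
            lock.writeLock().lock();
            try {
                numTasks = newNumTasks;
            } finally {
                lock.writeLock().unlock();
            }
        }
    }

    /**
     * Replays the interleaving deterministically on one thread.
     * Returns {count captured before the change, count seen at scheduling time}.
     */
    static int[] simulate() {
        DownstreamVertex reducer6 = new DownstreamVertex(10); // made-up initial count
        int planned = reducer6.getNumTasks();                 // captured before the change
        reducer6.setParallelism(4);                           // parallelism update lands
        int seenAtSchedule = reducer6.getNumTasks();          // upstream task scheduled later
        return new int[] {planned, seenAtSchedule};
    }

    public static void main(String[] args) {
        int[] result = simulate();
        // Each read is individually consistent, yet the two values disagree.
        System.out.println("planned=" + result[0] + ", seenAtSchedule=" + result[1]);
    }
}
```

      Locking each access is not enough on its own here: the mismatch comes from the ordering of the update relative to scheduling, so the two values need to be derived from the same point in time.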

        Attachments

        1. tez_2251_dag.png
          107 kB
          Rajesh Balamohan
        2. hive_console.png
          80 kB
          Rajesh Balamohan
        3. TEZ-2251.VertexImpl.patch
          7 kB
          Rajesh Balamohan
        4. TEZ-2251.fix_but_slows_down.patch
          2 kB
          Rajesh Balamohan
        5. tez-2251.vertexpatch.am.log.gz
          1.38 MB
          Rajesh Balamohan
        6. TEZ-2251.VertexImpl.readlock.patch
          4 kB
          Rajesh Balamohan
        7. TEZ-2251.2.patch
          3 kB
          Rajesh Balamohan
        8. TEZ-2251.3.patch
          3 kB
          Rajesh Balamohan

            People

            • Assignee: Rajesh Balamohan (rajesh.balamohan)
            • Reporter: Rajesh Balamohan (rajesh.balamohan)