Apache Tez / TEZ-2251

Race condition in VertexImpl & Edge causes DAG to hang

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Scenario:

      The vertex parallelism updates for "Reducer 5" and "Reducer 6" happen within a span of 3 milliseconds, and tasks of "Reducer 5" end up producing wrong partition details because they see the updated task count of "Reducer 6" when they are scheduled. This causes the job to hang.
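
The scenario above can be illustrated with a minimal Java sketch. It is not taken from VertexImpl, Edge, or any of the attached patches: every class and method name below is hypothetical, and the assumption that the routing side had already read the old task count is mine. The sketch replays one unlucky interleaving in a fixed order, so it is deterministic, and it shows that two readers of the same consumer vertex can disagree on the partition count even when each individual read is synchronized.

```java
// Hypothetical sketch; none of these names come from the Tez code base.
class PartitionMismatchSketch {

    /** Stand-in for a downstream vertex whose task count can change at runtime. */
    static final class ConsumerVertex {
        private int numTasks;

        ConsumerVertex(int numTasks) {
            this.numTasks = numTasks;
        }

        synchronized int getNumTasks() {
            return numTasks;
        }

        synchronized void setParallelism(int newNumTasks) {
            this.numTasks = newNumTasks;
        }
    }

    /**
     * Replays one interleaving in a fixed order and returns
     * {count seen by the routing side, count seen by the producer task}.
     */
    static int[] observeViews(int initialTasks, int updatedTasks) {
        ConsumerVertex reducer6 = new ConsumerVertex(initialTasks);
        int routingView = reducer6.getNumTasks();   // assumed: routing was set up before the update
        reducer6.setParallelism(updatedTasks);      // parallelism update lands in between
        int producerView = reducer6.getNumTasks();  // producer task is scheduled after the update
        return new int[] {routingView, producerView};
    }

    /** True when both sides agree on the number of partitions. */
    static boolean consistent(int routingView, int producerView) {
        return routingView == producerView;
    }

    public static void main(String[] args) {
        int[] views = observeViews(10, 4);
        System.out.println("routing expects " + views[0]
                + " partitions, producer writes " + views[1]);
        System.out.println("consistent: " + consistent(views[0], views[1]));
    }
}
```

In this sketch the two reads would have to happen under one lock held across the whole sequence (or the producer would have to be handed the same snapshot the routing side used) for the views to agree; synchronizing each getter on its own is not enough.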

      Attachments

        1. tez_2251_dag.png
          107 kB
          Rajesh Balamohan
        2. hive_console.png
          80 kB
          Rajesh Balamohan
        3. TEZ-2251.VertexImpl.patch
          7 kB
          Rajesh Balamohan
        4. TEZ-2251.fix_but_slows_down.patch
          2 kB
          Rajesh Balamohan
        5. tez-2251.vertexpatch.am.log.gz
          1.38 MB
          Rajesh Balamohan
        6. TEZ-2251.VertexImpl.readlock.patch
          4 kB
          Rajesh Balamohan
        7. TEZ-2251.2.patch
          3 kB
          Rajesh Balamohan
        8. TEZ-2251.3.patch
          3 kB
          Rajesh Balamohan

          People

            Assignee: Rajesh Balamohan (rajesh.balamohan)
            Reporter: Rajesh Balamohan (rajesh.balamohan)
