Apache Tez / TEZ-796

AM Hangs & does not kill containers when map-task fails


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.3.0
    • Fix Version/s: 0.3.0
    • Component/s: None
    • Labels: None

    Description

      The task hangs after a vertex fails with an error and continues to idle without returning an error.

      It looks like reducers continue to spin up even after a vertex fails.

      task_1389745080241_1218_6_04_000974,KILL_WAIT,KILLED
      task_1389745080241_1218_6_04_000976,KILL_WAIT,KILLED
      task_1389745080241_1218_6_09_000270,SCHEDULED,RUNNING
      task_1389745080241_1218_6_04_000978,KILL_WAIT,KILLED
      task_1389745080241_1218_6_04_000979,KILL_WAIT,KILLED
      task_1389745080241_1218_6_04_000980,KILL_WAIT,KILLED
      task_1389745080241_1218_6_04_000981,KILL_WAIT,KILLED
      task_1389745080241_1218_6_04_000982,KILL_WAIT,KILLED
      task_1389745080241_1218_6_04_000984,KILL_WAIT,KILLED
      task_1389745080241_1218_6_04_000987,KILL_WAIT,KILLED
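
      A listing like the one above can be scanned for the odd rows with a short script. The following is only a sketch, not part of Tez: it assumes each row is a task ID followed by two state columns (the report does not name the columns), and that the fifth underscore-separated field of the task ID is the vertex index. The helper names `group_by_vertex` and `still_starting` are hypothetical.

      ```python
      from collections import defaultdict

      # Sample rows copied from the listing in this report.
      SAMPLE = """\
      task_1389745080241_1218_6_04_000976,KILL_WAIT,KILLED
      task_1389745080241_1218_6_09_000270,SCHEDULED,RUNNING
      task_1389745080241_1218_6_04_000978,KILL_WAIT,KILLED
      """


      def group_by_vertex(text):
          """Map vertex index -> list of (task_id, first_state, second_state)."""
          groups = defaultdict(list)
          for line in text.splitlines():
              line = line.strip()
              if not line:
                  continue
              task_id, first_state, second_state = line.split(",")
              # Assumed ID layout: task_<timestamp>_<app>_<dag>_<vertex>_<task>
              vertex = task_id.split("_")[4]
              groups[vertex].append((task_id, first_state, second_state))
          return dict(groups)


      def still_starting(text):
          """Return task IDs whose last column is RUNNING."""
          return [
              task_id
              for rows in group_by_vertex(text).values()
              for (task_id, _, second_state) in rows
              if second_state == "RUNNING"
          ]


      if __name__ == "__main__":
          for task_id in still_starting(SAMPLE):
              print(task_id)
      ```

      Under these assumptions, the rows for vertex 04 all end in KILLED, while the single row for vertex 09 ends in RUNNING, which matches the observation that work keeps starting after the failure.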
      

      In the attached log file, the map vertex fails at 10:59, but DAG_FINISHED is not triggered until 11:03, at which point the AM was killed by hand.

      Attachments

        1. TEZ-796.1.patch
          6 kB
          Bikas Saha
        2. last_dag_error.txt.gz
          1.43 MB
          Gopal Vijayaraghavan

          People

            Assignee: Bikas Saha (bikassaha)
            Reporter: Gopal Vijayaraghavan (gopalv)
            Votes: 0
            Watchers: 3
