Hadoop Map/Reduce / MAPREDUCE-2846

a small % of all tasks fail with DefaultTaskController

    Details

    • Hadoop Flags:
      Reviewed
    • Release Note:
      Fixed a race condition in writing the log index file that caused tasks to 'fail'.

      Description

      After upgrading our test 0.20.203 grid to 0.20.204-rc2, we ran terasort to verify operation. While the job completed successfully, approximately 10% of the tasks failed with task runner execution errors and failures to create symlinks for attempt logs.
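
      The release note attributes these failures to a race condition in writing the log index file, but this page does not show how the patches resolve it. As a language-neutral illustration only (Hadoop itself is Java), the sketch below shows one common way to keep concurrent writers from exposing or clobbering a half-written index file: serialize writers with a lock, write to a temporary file, then rename it into place. The function name `write_index_file`, the lock, and the `name:start length` line format are all hypothetical and are not taken from the Hadoop source or from the attached patches.

```python
import os
import tempfile
import threading

# Hypothetical process-wide lock; assumes all writers live in one process.
_index_lock = threading.Lock()


def write_index_file(index_path, entries):
    """Write entries (log name -> (start, length)) so readers never see a partial file.

    The on-disk format used here is an assumption made for illustration.
    """
    directory = os.path.dirname(os.path.abspath(index_path))
    with _index_lock:  # only one writer at a time
        # Write the full content to a temporary file in the same directory,
        # so the final rename stays on one filesystem.
        fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".index-", suffix=".tmp")
        try:
            with os.fdopen(fd, "w") as tmp:
                for name, (start, length) in sorted(entries.items()):
                    tmp.write(f"{name}:{start} {length}\n")
                tmp.flush()
                os.fsync(tmp.fileno())
            # Swap the finished file into place in a single step.
            os.replace(tmp_path, index_path)
        except BaseException:
            # Clean up the temporary file if anything went wrong before the rename.
            if os.path.exists(tmp_path):
                os.remove(tmp_path)
            raise
```

      With this pattern, a reader that opens the index at any moment sees either the previous complete version or the new complete version, never a mixture of the two.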

      Attachments

      1. sync-trunk.patch
        1 kB
        Owen O'Malley
      2. sync.patch
        0.9 kB
        Owen O'Malley

        Issue Links

          Activity

          Owen O'Malley made changes -
          Status
            Original Value: Resolved [ 5 ]
            New Value: Closed [ 6 ]
          Owen O'Malley made changes -
          Status
            Original Value: Open [ 1 ]
            New Value: Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Fix Version/s 0.20.204.0 [ 12316318 ]
          Fix Version/s 0.23.0 [ 12315570 ]
          Resolution Fixed [ 1 ]
          Owen O'Malley made changes -
          Release Note
            Original Value: Fixed a race condition in writing the log index file that caused tasks to fail.
            New Value: Fixed a race condition in writing the log index file that caused tasks to 'fail'.
          Owen O'Malley made changes -
          Release Note Fixed a race condition in writing the log index file that caused tasks to fail.
          Owen O'Malley made changes -
          Attachment sync-trunk.patch [ 12490980 ]
          Owen O'Malley made changes -
          Attachment sync.patch [ 12490979 ]
          Owen O'Malley made changes -
          Assignee Owen O'Malley [ owen.omalley ]
          Allen Wittenauer made changes -
          Summary
            Original Value: approx 10% of all tasks fail with DefaultTaskController
            New Value: a small % of all tasks fail with DefaultTaskController
          Allen Wittenauer made changes -
          Link This issue is related to MAPREDUCE-2804 [ MAPREDUCE-2804 ]
          Allen Wittenauer made changes -
          Description
            Original Value: After upgrading our test 0.20.203 grid to 0.20.204, we ran terasort to verify operation. While the job completed successfully, approx 10% of the tasks failed with task runner execution errors and the inability to create symlinks for attempt logs.
            New Value: After upgrading our test 0.20.203 grid to 0.20.204-rc2, we ran terasort to verify operation. While the job completed successfully, approx 10% of the tasks failed with task runner execution errors and the inability to create symlinks for attempt logs.
          Allen Wittenauer made changes -
          Field Original Value New Value
          Link This issue relates to MAPREDUCE-2415 [ MAPREDUCE-2415 ]
          Allen Wittenauer created issue -

            People

            • Assignee:
              Owen O'Malley
              Reporter:
              Allen Wittenauer
            • Votes:
              0
              Watchers:
              4
