  Hadoop Common / HADOOP-1077

Race condition in fetching map outputs (might lead to hung reduces)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.1
    • Component/s: None
    • Labels: None

      Description

      Sometimes a map task is lost while its output is being fetched from the TaskTracker (TT) that ran it, and the lost map has already re-executed successfully on another node. In that case the event for the successful re-execution can be lost at the fetching TT. The fetching TT might eventually fail to fetch the output of the lost task, but because the event for the new run of the map might also have been lost, the fetching TT might hang.

      This hang was discovered while working on HADOOP-1060.
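
      The sequence of events described above can be replayed in a small model. This is only an illustrative sketch, not the code touched by the attached patches: the class FetchTracker, its methods, and the names map_0001, hostA and hostB are all hypothetical. It assumes the fetching side keeps one known output location per map and shows how dropping a completion event that arrives during an in-progress fetch leaves no location to retry from.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the fetching side; NOT Hadoop's actual TaskTracker code.
class FetchTracker {
    // true models the reported race: events arriving mid-fetch are dropped
    private final boolean dropEventsDuringFetch;
    // mapId -> host believed to hold that map's output
    private final Map<String, String> location = new HashMap<>();
    // mapId -> host a fetch is currently in progress from
    private final Map<String, String> inFlight = new HashMap<>();

    FetchTracker(boolean dropEventsDuringFetch) {
        this.dropEventsDuringFetch = dropEventsDuringFetch;
    }

    // A "map completed" event arrives: the output of mapId is on host.
    synchronized void onMapCompleted(String mapId, String host) {
        if (dropEventsDuringFetch && inFlight.containsKey(mapId)) {
            return; // the event for the re-run is silently lost
        }
        location.put(mapId, host);
    }

    // Begin a fetch; returns the host to fetch from, or null if none is known.
    synchronized String startFetch(String mapId) {
        String host = location.get(mapId);
        if (host != null) {
            inFlight.put(mapId, host);
        }
        return host;
    }

    // The in-progress fetch failed; forget the failed host only if no
    // newer location has replaced it in the meantime.
    synchronized void onFetchFailed(String mapId) {
        String failedHost = inFlight.remove(mapId);
        if (failedHost != null) {
            location.remove(mapId, failedHost);
        }
    }

    // Null means the fetcher has nothing to retry and waits for an event.
    synchronized String knownLocation(String mapId) {
        return location.get(mapId);
    }

    // Replays the scenario from the description and returns the location
    // the fetcher is left with after the failed fetch.
    static String replay(boolean dropEventsDuringFetch) {
        FetchTracker t = new FetchTracker(dropEventsDuringFetch);
        t.onMapCompleted("map_0001", "hostA"); // first run of the map
        t.startFetch("map_0001");              // fetch from hostA begins
        t.onMapCompleted("map_0001", "hostB"); // re-run finishes mid-fetch
        t.onFetchFailed("map_0001");           // hostA was lost, fetch fails
        return t.knownLocation("map_0001");
    }

    public static void main(String[] args) {
        System.out.println("events dropped: " + replay(true));
        System.out.println("events kept:    " + replay(false));
    }
}
```

      With events dropped, the model ends with no known location for map_0001, which corresponds to the hang; with events kept, it ends with hostB and the fetch can be retried. Keeping the newer event is just one conceivable remedy within this model and is not claimed to be what 1077.patch does.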

        Attachments

        1. 1077.patch
          7 kB
          Devaraj Das
        2. 1077.2.patch
          6 kB
          Arun C Murthy

              People

              • Assignee: Devaraj Das
              • Reporter: Devaraj Das