  1. Hadoop Map/Reduce
  2. MAPREDUCE-5060

Fetch failures that time out only count against the first map task

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.7, 2.1.0-beta
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      When a fetch fails after the socket has already "connected", the failure is counted only against the first map task. Most of the time, however, the cause is a problem with the node itself rather than with an individual map task, so every failure that occurs while initiating the connection should count against all of the tasks.

      This caused a particularly unfortunate job to take an hour and a half longer than it needed to.
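
The accounting described above can be illustrated with a minimal sketch. This is not the attached patch or Hadoop's actual shuffle code: the class `FetchFailureSketch`, its methods, and the `failedWhileConnecting` flag are hypothetical names chosen for illustration. The sketch also assumes that a failure occurring after the connection is established is blamed only on the output being read at the time.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these names are illustrative and are not Hadoop's classes.
class FetchFailureSketch {
    // Number of fetch failures recorded per map task ID.
    private final Map<String, Integer> failures = new LinkedHashMap<>();

    private void penalize(String mapId) {
        failures.merge(mapId, 1, Integer::sum);
    }

    /**
     * Records one failed fetch attempt against a host.
     *
     * @param pendingMaps map outputs the fetcher intended to pull from the host,
     *                    in fetch order
     * @param failedWhileConnecting true if the failure (including a timeout)
     *                    happened while the connection was being initiated
     */
    void recordFailure(List<String> pendingMaps, boolean failedWhileConnecting) {
        if (pendingMaps.isEmpty()) {
            return;
        }
        if (failedWhileConnecting) {
            // Likely a node-level problem: count it against every map on the host.
            for (String mapId : pendingMaps) {
                penalize(mapId);
            }
        } else {
            // Assumption: a failure while reading one output is blamed on that output only.
            penalize(pendingMaps.get(0));
        }
    }

    int failuresFor(String mapId) {
        return failures.getOrDefault(mapId, 0);
    }

    public static void main(String[] args) {
        FetchFailureSketch sketch = new FetchFailureSketch();
        List<String> maps = Arrays.asList("map_0", "map_1", "map_2");
        // A connection timeout penalizes all three maps, not just map_0.
        sketch.recordFailure(maps, true);
        for (String mapId : maps) {
            System.out.println(mapId + " failures: " + sketch.failuresFor(mapId));
        }
    }
}
```

With per-host accounting of this kind, every map output on an unreachable node accumulates failures at the same rate, instead of the reducer having to time out once per map task before each one is reported.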

        Attachments

        1. MR-5060.txt
          2 kB
          Robert Joseph Evans
        2. MR-5060.txt
          5 kB
          Robert Joseph Evans
        3. MR-5060-trunk.txt
          5 kB
          Robert Joseph Evans

            People

            • Assignee:
              revans2 Robert Joseph Evans
              Reporter:
              revans2 Robert Joseph Evans
            • Votes:
              0
              Watchers:
              4