Hadoop Common / HADOOP-1270

Randomize the fetch of map outputs

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.12.3
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

      Description

      HADOOP-248 did away with random probing of maps for locating map outputs; we now rely on TaskCompletionEvents to locate them instead.

      However, we lost a side benefit of the random probing: it spread the requests out, so a map's jetty wasn't overloaded with requests for its outputs. We now have a situation where a map completes, the JT is notified, all the reduces get the TaskCompletionEvent and pretty much swamp the poor map's jetty, and this repeats for each map.

      I propose a minor change: collect a set of TaskCompletionEvents and randomize the list before firing the fetches. This should help fix the mass hysteria at the map's jetty.
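
      A minimal sketch of the idea, to make the proposal concrete. It is not the code from the attached patches: the class FetchOrderSketch, the MapOutputLocation type and the randomizeFetchOrder method are hypothetical names invented for illustration, and the only library call it relies on is the standard java.util.Collections.shuffle. Each reduce would collect a batch of completed map outputs, shuffle its own copy, and only then start fetching, so that different reduces are unlikely to hit the same map's jetty at the same moment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch only; names below are not taken from the Hadoop code base.
class FetchOrderSketch {

    // Stand-in for what a reduce learns about one completed map
    // (in the real system this would come from a TaskCompletionEvent).
    static final class MapOutputLocation {
        final String mapTaskId;
        final String host;

        MapOutputLocation(String mapTaskId, String host) {
            this.mapTaskId = mapTaskId;
            this.host = host;
        }
    }

    // Returns a shuffled copy of the batch; the caller's list is left untouched.
    // Each reduce uses its own Random, so fetch orders differ across reduces.
    static List<MapOutputLocation> randomizeFetchOrder(
            List<MapOutputLocation> batch, Random rng) {
        List<MapOutputLocation> shuffled = new ArrayList<>(batch);
        Collections.shuffle(shuffled, rng);
        return shuffled;
    }

    public static void main(String[] args) {
        // Collect a batch of completed maps instead of fetching as each event arrives.
        List<MapOutputLocation> batch = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            batch.add(new MapOutputLocation("map_" + i, "host" + i));
        }
        // Fire the fetches in randomized order (printing stands in for a fetch).
        for (MapOutputLocation loc : randomizeFetchOrder(batch, new Random())) {
            System.out.println("fetch " + loc.mapTaskId + " from " + loc.host);
        }
    }
}
```

      How large a batch to collect before shuffling is a tuning choice this sketch leaves open: a larger batch spreads load better but delays the first fetch.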

      Thoughts?

        Attachments

        1. HADOOP-1270_20070425_1.patch
          5 kB
          Arun C Murthy
        2. HADOOP-1270_20070504_2.patch
          5 kB
          Arun C Murthy
        3. HADOOP-1270_20070505_3.patch
          5 kB
          Arun C Murthy
        4. post-H-1270.png
          27 kB
          Arun C Murthy
        5. pre-H-1270.png
          34 kB
          Arun C Murthy

            People

            • Assignee: Arun C Murthy (acmurthy)
            • Reporter: Arun C Murthy (acmurthy)