  Hadoop Map/Reduce / MAPREDUCE-285

Reuse output collectors across maps running on the same jvm


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      We have evidence that cutting the shuffle-crossbar between maps and reduces (m * r connections) leads to more performant applications, since:

      1. It cuts down the number of connections necessary to shuffle and hence reduces load on the serving-side (TaskTracker) and improves latency (terasort, HADOOP-1338, HADOOP-5223)
      2. Reduces seeks required for the TaskTracker to serve the map-outputs

      So far we've had to manually tune applications to cut down the shuffle-crossbar by having fatter maps with custom input formats etc. For example, we saw a significant improvement while running the petasort when we went from ~800,000 maps to ~80,000 maps (1.5G to 15G per map), i.e. from 48+ hours to 16 hours.

      The downsides are:

      1. The burden falls on the application-writer to tune this with custom input-formats etc.
      2. The naive method of using a higher min.split.size leads to considerable non-local I/O on the maps (a minimal configuration sketch follows this list).
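      For concreteness, a minimal sketch of that naive approach, assuming the old-style property name mapred.min.split.size (newer releases spell it mapreduce.input.fileinputformat.split.minsize); the class and method names here are made up for illustration:

      import org.apache.hadoop.mapred.JobConf;

      public class FatMapSplits {
        // Force ~15G input splits so far fewer map tasks are launched.
        public static void configure(JobConf conf) {
          // Shrinks m in the m * r shuffle-crossbar, but a 15G split spans many HDFS
          // blocks, so most of each map's input is no longer node-local.
          conf.setLong("mapred.min.split.size", 15L * 1024 * 1024 * 1024);
        }
      }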

      Given these, the proposal is to keep the 'output collector' open across JVM reuse for maps, thereby enabling 'combiners' across map-tasks. This would have the happy effect of fixing both of the above. The downsides are that it adds latency to jobs (since map-outputs cannot be shuffled until a few maps on the same JVM are done, followed by a final sort/merge/combine) and that the failure cases get a bit more complicated.
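      To make the idea concrete, here is a rough, self-contained sketch (illustrative only; this is not the actual MapTask/collector code, and all names below are made up): a per-JVM buffer that successive map tasks feed, with the combiner applied across them, and outputs only exposed to the shuffle after a few maps have finished.

      import java.util.Map;
      import java.util.TreeMap;

      public class SharedMapOutputBuffer {
        // One instance per reused JVM; every map task running in this JVM writes here.
        private static final Map<String, Long> buffer = new TreeMap<>();
        private static int mapsSinceFlush = 0;
        private static final int MAPS_PER_FLUSH = 4; // assumed: combine across 4 maps before shuffling

        // Map tasks call this instead of spilling their own per-task output.
        public static synchronized void collect(String key, long value) {
          buffer.merge(key, value, Long::sum); // combiner folded in at collect time
        }

        // Called at the end of each map task. Nothing is made visible to reduces until
        // MAPS_PER_FLUSH maps have completed; then a final sort/merge/combine (here just
        // the sorted dump of the TreeMap) is served. This is the added latency and the
        // more complicated failure handling noted above.
        public static synchronized void mapDone() {
          if (++mapsSinceFlush >= MAPS_PER_FLUSH) {
            buffer.forEach((k, v) -> System.out.println(k + "\t" + v));
            buffer.clear();
            mapsSinceFlush = 0;
          }
        }
      }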

      Thoughts? Let's discuss.

          People

            Assignee: Unassigned
            Reporter: Arun Murthy (acmurthy)
            Votes: 1
            Watchers: 8
