Hadoop Map/Reduce
MAPREDUCE-1521

Protection against incorrectly configured reduces

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.1
    • Component/s: jobtracker
    • Labels:
      None

      Description

      We've seen a fair number of instances where naive users process huge data-sets (>10 TB) with a badly mis-configured number of reduces, e.g. a single reduce.

      This is a significant problem on large clusters: each attempt of the reduce takes a long time to shuffle all of the map output, and then runs into problems such as exhausting local disk space. Worse, the framework retries each failed task up to 4 times before failing the job, multiplying the wasted time.

      Proposal: come up with heuristics/configs to fail such jobs early.

      Thoughts?
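      One possible shape for such a heuristic (a sketch only, not the attached patch): compare the estimated map output size against the configured number of reduces, and reject the job up front if each reduce would have to shuffle more than a configurable byte limit. All names and the default threshold below are illustrative assumptions.

      ```java
      // Hypothetical sketch of an early-failure heuristic for mis-configured
      // #reduces. The class name, config default, and threshold are invented
      // for illustration; they are not part of the actual MAPREDUCE-1521 patch.
      public class ReduceSanityCheck {

          // Illustrative default: refuse jobs where a single reduce would
          // have to shuffle more than 10 GB.
          static final long DEFAULT_MAX_BYTES_PER_REDUCE =
                  10L * 1024 * 1024 * 1024;

          /**
           * Returns true if the job should be failed early because the
           * estimated map output, divided across the configured reduces,
           * exceeds the per-reduce limit.
           */
          static boolean shouldFailEarly(long estimatedMapOutputBytes,
                                         int numReduces,
                                         long maxBytesPerReduce) {
              if (numReduces <= 0) {
                  return false; // map-only job: nothing to shuffle
              }
              long bytesPerReduce = estimatedMapOutputBytes / numReduces;
              return bytesPerReduce > maxBytesPerReduce;
          }

          public static void main(String[] args) {
              long tenTb = 10L * 1024 * 1024 * 1024 * 1024;
              // 10 TB funneled into a single reduce: reject early.
              System.out.println(
                  shouldFailEarly(tenTb, 1, DEFAULT_MAX_BYTES_PER_REDUCE));
              // Same data over 4000 reduces (~2.5 GB each): allow.
              System.out.println(
                  shouldFailEarly(tenTb, 4000, DEFAULT_MAX_BYTES_PER_REDUCE));
          }
      }
      ```

      The check only needs an *estimate* of map output (e.g. from a resource estimator based on completed maps), so it can run after the first few maps finish rather than blocking job submission.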

      1. MAPREDUCE-1521-0.20-yahoo.patch
        12 kB
        Mahadev konar
      2. MAPREDUCE-1521-0.20-yahoo.patch
        11 kB
        Mahadev konar
      3. MAPREDUCE-1521-0.20-yahoo.patch
        11 kB
        Mahadev konar
      4. MAPREDUCE-1521-0.20-yahoo.patch
        9 kB
        Mahadev konar
      5. MAPREDUCE-1521-0.20-yahoo.patch
        3 kB
        Mahadev konar
      6. MAPREDUCE-1521-trunk.patch
        13 kB
        Mahadev konar
      7. resourceestimator-threshold.txt
        2 kB
        Todd Lipcon
      8. resourcestimator-overflow.txt
        1 kB
        Todd Lipcon

            People

            • Assignee:
              Mahadev konar
            • Reporter:
              Arun C Murthy
            • Votes:
              0
            • Watchers:
              13
