Hadoop Common / HADOOP-3226

Run combiner when merging spills from map output

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Changed policy for running combiner. The combiner may be run multiple times as the map's output is sorted and merged. Additionally, it may be run on the reduce side as data is merged. The old semantics are available in Hadoop 0.18 if the user calls:
      job.setCombineOnlyOnce(true);
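Since the combiner may now run more than once over the same key's values, a combine function has to produce the same final result no matter how many times it is applied. The plain-Java sketch below illustrates that requirement without using any Hadoop classes; `CombinerRepeatDemo`, `sum`, and `mean` are hypothetical names chosen for this example, not part of the patch.

```java
import java.util.Arrays;
import java.util.List;

public class CombinerRepeatDemo {
    // Sum is associative and commutative, so summing partial sums yields the total.
    static long sum(List<Long> values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    // Mean is NOT safe to re-apply: a mean of partial means weights groups unevenly.
    static double mean(List<Double> values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total / values.size();
    }

    public static void main(String[] args) {
        // One pass over all values for a key.
        long once = sum(Arrays.asList(1L, 2L, 3L, 4L));
        // Two passes: combine each group first, then combine the partial results.
        long twice = sum(Arrays.asList(
                sum(Arrays.asList(1L, 2L, 3L)),
                sum(Arrays.asList(4L))));
        System.out.println("sum stable under re-combining: " + (once == twice));

        double meanOnce = mean(Arrays.asList(1.0, 2.0, 3.0, 4.0));
        double meanTwice = mean(Arrays.asList(
                mean(Arrays.asList(1.0, 2.0, 3.0)),
                mean(Arrays.asList(4.0))));
        System.out.println("mean stable under re-combining: " + (meanOnce == meanTwice));
    }
}
```

A job whose combiner behaves like `mean` here would need either a reformulated combiner (for example, carrying a sum and a count) or the `job.setCombineOnlyOnce(true)` setting mentioned in the release note.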

      Description

      When merging spills from the map, running the combiner should further diminish the volume of data we send to the reduce.
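To make the expected saving concrete, here is a rough plain-Java sketch of combining while merging spills. It is not the code from the attached patches: `SpillMergeDemo`, `rec`, and `mergeWithCombiner` are hypothetical names, the combiner is assumed to be a per-key sum, and a `TreeMap` stands in for the sorted merge that the map side actually performs.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SpillMergeDemo {
    // Helper to build one (key, count) record of a spill.
    static Map.Entry<String, Long> rec(String key, long count) {
        return new AbstractMap.SimpleEntry<String, Long>(key, count);
    }

    // Merge all spills and apply a summing combiner to records that share a key.
    // The TreeMap keeps keys sorted, standing in for a real sorted merge.
    static Map<String, Long> mergeWithCombiner(List<List<Map.Entry<String, Long>>> spills) {
        Map<String, Long> merged = new TreeMap<>();
        for (List<Map.Entry<String, Long>> spill : spills) {
            for (Map.Entry<String, Long> record : spill) {
                merged.merge(record.getKey(), record.getValue(), Long::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Map.Entry<String, Long>>> spills = Arrays.asList(
                Arrays.asList(rec("a", 2), rec("b", 1)),
                Arrays.asList(rec("a", 3), rec("c", 1)),
                Arrays.asList(rec("a", 1), rec("b", 4)));

        int recordsIn = 0;
        for (List<Map.Entry<String, Long>> spill : spills) {
            recordsIn += spill.size();
        }
        Map<String, Long> merged = mergeWithCombiner(spills);

        // Six spilled records collapse to one record per distinct key.
        System.out.println(recordsIn + " records in, " + merged.size() + " out: " + merged);
    }
}
```

In this toy input, keys repeated across spills collapse to a single record each, which is the reduction in data sent to the reduce that the description anticipates.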

      1. 3226-3.patch
        13 kB
        Chris Douglas
      2. 3226-2.patch
        15 kB
        Chris Douglas
      3. 3226-1.patch
        15 kB
        Chris Douglas
      4. 3226-0.patch
        2 kB
        Chris Douglas


          People

          • Assignee:
            Chris Douglas
            Reporter:
            Chris Douglas
