  1. Hadoop Map/Reduce
  2. MAPREDUCE-4585

Checkpoint shuffle aggregation as map output

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: task
    • Labels: None

    Description

      Map output collected during the shuffle can be spilled and written as a composite of map outputs. Particularly if the job employs a combiner, this checkpoint can provide fault tolerance and improve job throughput by aggregating intermediate output. The latter is especially helpful for jobs with multiple waves of reduces.
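
      The idea can be illustrated with a minimal sketch. It is not Hadoop code and does not reflect the attached design: the function names (`combine`, `merge_with_combiner`, `checkpoint`, `resume`), the dictionary layout of the composite, and the summing combiner are all assumptions made for illustration. The sketch merges the key-sorted map outputs fetched so far, applies the combiner, and records which map tasks the composite covers, so that a restarted reduce only needs the remaining outputs.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def combine(key, values):
    """Hypothetical combiner: sums the values for a key (word-count style)."""
    return sum(values)


def merge_with_combiner(segments):
    """Merge key-sorted segments and apply the combiner to each key group."""
    merged = heapq.merge(*segments, key=itemgetter(0))
    return [(key, combine(key, [v for _, v in group]))
            for key, group in groupby(merged, key=itemgetter(0))]


def checkpoint(fetched):
    """Collapse the fetched map outputs into one composite record.

    `fetched` maps a map-task id to its key-sorted (key, value) output.
    The composite remembers which map tasks it covers, so a restarted
    reduce knows which outputs it no longer has to fetch.
    """
    return {
        "covers": frozenset(fetched),
        "records": merge_with_combiner(fetched.values()),
    }


def resume(composite, remaining):
    """Finish the shuffle from a checkpoint plus the not-yet-fetched outputs."""
    segments = [composite["records"], *remaining.values()]
    return merge_with_combiner(segments)
```

      With three map outputs `m0`, `m1`, and `m2`, checkpointing after `m0` and `m1` and then resuming with `m2` gives the same records as merging all three at once, provided the combiner is associative and commutative (as the sum used here is). The composite is also no larger than the outputs it replaces, which is where the throughput benefit for later reduce waves would come from.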

      Attachments

        1. shufflecheckpoint.pdf
          102 kB
          Carlo Curino

            People

              Assignee: Unassigned
              Reporter: Christopher Douglas (cdouglas)
              Votes: 0
              Watchers: 8

              Dates

                Created:
                Updated: