  1. Hadoop Map/Reduce
  2. MAPREDUCE-4585

Checkpoint shuffle aggregation as map output

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: task
    • Labels: None

    Description

      Map output collected during the shuffle can be spilled and written as a composite of map outputs. Particularly if the job employs a combiner, this checkpoint can provide fault tolerance and improve job throughput by aggregating intermediate output. The latter is especially helpful for jobs with multiple waves of reduces.
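
The idea can be illustrated with a small, framework-independent sketch. It is not Hadoop code and does not reflect the design in the attached document: the function name `checkpoint_segments`, the in-memory list representation of map outputs, and the simplification that the combiner returns exactly one value per key are all assumptions made for illustration. Sorted segments fetched for one reduce partition are merged, the combiner is applied per key, and the result is a single sorted segment that could be served in place of the original map outputs.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def checkpoint_segments(segments, combiner):
    """Merge sorted (key, value) segments into one combined, sorted segment.

    Hypothetical helper: `segments` stands in for map outputs already fetched
    during the shuffle, and `combiner` reduces a list of values to one value.
    """
    # k-way merge of the already-sorted segments, ordered by key
    merged = heapq.merge(*segments, key=itemgetter(0))
    composite = []
    # Equal keys are now adjacent, so they can be grouped and combined
    for key, group in groupby(merged, key=itemgetter(0)):
        values = [value for _, value in group]
        composite.append((key, combiner(values)))
    return composite


# Three made-up map outputs for the same reduce partition, each sorted by key
map_outputs = [
    [("a", 1), ("c", 2)],
    [("a", 4), ("b", 1)],
    [("b", 3), ("c", 5)],
]

checkpoint = checkpoint_segments(map_outputs, sum)
print(checkpoint)  # [('a', 5), ('b', 4), ('c', 7)]
```

In this toy case six records collapse to three, which is where the throughput benefit of aggregating intermediate output would come from. A re-executed reduce attempt could read the composite instead of fetching every map output again.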

      Attachments

        1. shufflecheckpoint.pdf
          102 kB
          Carlo Curino

          People

            Assignee: Unassigned
            Reporter: Christopher Douglas (cdouglas)

            Dates

              Created:
              Updated:
