Hadoop Map/Reduce / MAPREDUCE-4584

Umbrella: Preemption and restart of MapReduce tasks

      Description

      This JIRA will track the implementation of improvements to the handling of intermediate data (e.g., map output). Specifically, it tracks changes in support of preempting running tasks, checkpointing completed work, and spawning one or more tasks to complete the original split/partition. These mechanisms allow one to manage skew in intermediate data, respond to resource abundance or scarcity (particularly with preemption), speculatively execute on the remaining work from checkpointed tasks, and automatically tune parameters for performance.
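      As a rough illustration of the preempt/checkpoint/restart cycle described above, the following is a minimal sketch and not Hadoop code. The names `Checkpoint`, `run_task`, and `run_with_preemption` are hypothetical, and the "split" is assumed to be an in-memory list of records. Each preempted attempt records how far it got, and a new attempt resumes from that offset.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

Record = Tuple[str, int]


@dataclass
class Checkpoint:
    """Hypothetical record of the work an attempt completed."""
    next_offset: int      # index of the first record NOT yet processed
    output: List[Record]  # intermediate output produced by this attempt


def run_task(records: List[str], start: int,
             should_preempt: Callable[[int], bool]) -> Tuple[Checkpoint, bool]:
    """Process records[start:] until the split is done or the task is preempted.

    Returns the checkpoint and a flag saying whether the split is finished.
    """
    output: List[Record] = []
    offset = start
    while offset < len(records):
        if should_preempt(offset):
            # Preempted: keep what was produced and remember where to resume.
            return Checkpoint(offset, output), False
        output.append((records[offset], 1))  # stand-in for the map function
        offset += 1
    return Checkpoint(offset, output), True


def run_with_preemption(records: List[str],
                        preempt_at: Set[int]) -> List[List[Record]]:
    """Run one split to completion, spawning a new attempt after each preemption.

    Returns one output segment per attempt.
    """
    pending = set(preempt_at)  # each offset triggers at most one preemption
    segments: List[List[Record]] = []
    start, finished = 0, False

    def should_preempt(offset: int) -> bool:
        if offset in pending:
            pending.discard(offset)
            return True
        return False

    while not finished:
        checkpoint, finished = run_task(records, start, should_preempt)
        segments.append(checkpoint.output)
        start = checkpoint.next_offset
    return segments
```

      In this sketch no record is reprocessed after a preemption: the segments from all attempts, concatenated in order, cover the split exactly once.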

      Iterations will build on learnings from previous work, including the following:

      Technical reports:
      http://research.yahoo.com/files/yl-2012-002.pdf
      http://research.yahoo.com/files/yl-2012-003.pdf

      Source code:
      http://code.google.com/p/sailfish

          Activity

          Chris Douglas added a comment -

          Tsuyoshi Ozawa: I've been reading some of the iterations of your patch(es) as you've updated them over the last few months. Our proposals are absolutely complementary. Your approach (IIRC) involved reusing map tasks to aggregate map output on the same host, right? MAPREDUCE-4502 can accomplish more than checkpointing by aggregating across partitions.

          We added some metadata to IFile to track which task attempts a segment contains. I haven't looked at a recent version of your patch, but that's certainly shared functionality.

          Tsuyoshi Ozawa added a comment -

          I agree with your strategy. I'm working on MAPREDUCE-4502, which is related to your work; however, the patch has become too large to review. I had planned to split it into smaller patches, but your changes affect my work, so I'd like to follow your strategy. Essentially, your proposal and node-level map-side aggregation (MAPREDUCE-4502) complement each other, so the performance impact could be much better if all of these features are included in MapReduce.

          One proposal: use node-level aggregation as an optimization for reducer-side preemption. If many IFiles need to be fetched and the job is an aggregation-type job, mapper-side aggregation reduces the amount of data fetched more effectively than fetching in parallel via reducer preemption. These features could cooperate, or the framework could switch between strategies. Any ideas?
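          To make the fetch-size argument concrete, here is a minimal sketch rather than the MAPREDUCE-4502 implementation. It assumes a sum-style combiner (associative and commutative) and represents each map output as a hypothetical `(host, records)` pair; `aggregate_per_node` and `records_to_fetch` are illustrative names only.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

MapOutput = Tuple[str, List[Tuple[str, int]]]  # (host, [(key, value), ...])


def records_to_fetch(map_outputs: List[MapOutput]) -> int:
    """Records a reducer would fetch if every map output is fetched as-is."""
    return sum(len(records) for _, records in map_outputs)


def aggregate_per_node(map_outputs: List[MapOutput]) -> Dict[str, Counter]:
    """Combine the outputs of all map tasks that ran on the same host.

    Assumes values for the same key can simply be summed.
    """
    per_node: Dict[str, Counter] = defaultdict(Counter)
    for host, records in map_outputs:
        for key, value in records:
            per_node[host][key] += value
    return dict(per_node)


outputs = [
    ("host1", [("a", 1), ("b", 1)]),
    ("host1", [("a", 1)]),
    ("host2", [("a", 1)]),
]
aggregated = aggregate_per_node(outputs)
before = records_to_fetch(outputs)                    # one record per map-output entry
after = sum(len(c) for c in aggregated.values())      # one record per (host, key)
```

          With these toy inputs the reducer fetches 3 records instead of 4. The saving grows with the number of map tasks per host that emit the same keys, which is the aggregation-type case the comment refers to.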

          Chris Douglas added a comment -

          Our strategy is not to port existing code to Hadoop, but rather to iterate on the existing framework and gradually introduce mechanisms that support these goals. The experimental support in the referenced reports should help to justify these improvements, though no single iteration should cause a regression. The subtasks are intended to outline our plan in broad strokes, leaving plenty of space for collaboration and refinement as we learn.


            People

            • Assignee:
              Chris Douglas
              Reporter:
              Sriram Rao
            • Votes:
              0
              Watchers:
              27
