  1. Hadoop Map/Reduce
  2. MAPREDUCE-6870

Add configuration for MR job to finish when all reducers are complete (even with unfinished mappers)

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.1
    • Fix Version/s: 3.0.0-beta1
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Enables mapreduce.job.finish-when-all-reducers-done by default. With this enabled, a MapReduce job will complete as soon as all of its reducers are complete, even if some mappers are still running. This can occur if a mapper was relaunched after node failure but the relaunched task's output is not actually needed. Previously the job would wait for all mappers to complete.
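      Since the option is enabled by default, a job that needs to wait for every mapper has to turn it off explicitly. A minimal sketch of that override, assuming it goes in mapred-site.xml or a per-job configuration file; only the property name is taken from the release note above:

```xml
<!-- Restores the earlier behaviour: the job waits for all mappers to complete. -->
<property>
  <name>mapreduce.job.finish-when-all-reducers-done</name>
  <value>false</value>
</property>
```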

      Description

      Even with MAPREDUCE-5817, there can still be cases where mappers are scheduled before all reducers are complete and then keep running for a long time, even after all reducers have finished. This can hurt the performance of large MR jobs.

      In some cases, mappers produce no materialized output other than the intermediate data they provide to reducers. In those cases, the job owner should have a config option to finish the job once all reducers are complete.
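      The requested behaviour can be pictured as a small completion check. The sketch below is a simplified model, not the code from the attached patches: the class and method names are hypothetical, and the rule that a job with zero reducers must still wait for its mappers is an assumption made here so that map-only jobs do not finish instantly.

```java
// Simplified model of the completion rule; names are hypothetical, not Hadoop's API.
final class JobCompletionSketch {

    // Stands in for the value of mapreduce.job.finish-when-all-reducers-done.
    private final boolean finishWhenAllReducersDone;

    JobCompletionSketch(boolean finishWhenAllReducersDone) {
        this.finishWhenAllReducersDone = finishWhenAllReducersDone;
    }

    // Returns true when the job may be declared complete.
    boolean isJobComplete(int totalMaps, int completedMaps,
                          int totalReduces, int completedReduces) {
        boolean allMapsDone = completedMaps == totalMaps;
        boolean allReducesDone = completedReduces == totalReduces;

        // Behaviour without the option: every task must have finished.
        if (allMapsDone && allReducesDone) {
            return true;
        }

        // With the option: unfinished mappers no longer block completion,
        // provided the job actually has reducers (assumption of this sketch).
        return finishWhenAllReducersDone && totalReduces > 0 && allReducesDone;
    }

    public static void main(String[] args) {
        // One relaunched mapper is still running, all 4 reducers are done.
        JobCompletionSketch enabled = new JobCompletionSketch(true);
        JobCompletionSketch disabled = new JobCompletionSketch(false);
        System.out.println(enabled.isJobComplete(10, 9, 4, 4));  // true
        System.out.println(disabled.isJobComplete(10, 9, 4, 4)); // false
    }
}
```

      With the option off, the same job state (9 of 10 mappers done, all reducers done) keeps the job running until the last mapper finishes, which is the delay this issue describes.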

        Attachments

        1. MAPREDUCE-6870-007.patch
          12 kB
          Peter Bacsko
        2. MAPREDUCE-6870-006.patch
          12 kB
          Peter Bacsko
        3. MAPREDUCE-6870-005.patch
          12 kB
          Peter Bacsko
        4. MAPREDUCE-6870-004.patch
          12 kB
          Peter Bacsko
        5. MAPREDUCE-6870-003.patch
          15 kB
          Peter Bacsko
        6. MAPREDUCE-6870-002.patch
          11 kB
          Peter Bacsko
        7. MAPREDUCE-6870-001.patch
          16 kB
          Peter Bacsko

              People

              • Assignee:
                pbacsko Peter Bacsko
                Reporter:
                zhz Zhe Zhang
              • Votes:
                0
                Watchers:
                12
