Hadoop Map/Reduce: MAPREDUCE-4228

mapreduce.job.reduce.slowstart.completedmaps is not working properly to delay the scheduling of the reduce tasks


Details

    Description

      If no more map tasks need to be scheduled but some have not yet completed, the ApplicationMaster starts scheduling reducers even if the number of completed maps has not reached the mapreduce.job.reduce.slowstart.completedmaps threshold. For example, if the property is set to 1.0, all maps should complete before any reducers are scheduled. However, the reducers are scheduled as soon as the last map task is assigned to a container. For a job with very long-running maps on a cluster with enough capacity to launch all map tasks at once, reducers can launch prematurely and waste cluster resources.
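
      The expected behavior described above can be sketched as a simple threshold check. This is only an illustration of the intended semantics, not the ApplicationMaster's actual code: the class SlowstartCheck and the method shouldScheduleReduces are hypothetical names, and rounding the threshold up with Math.ceil is an assumption.

```java
public class SlowstartCheck {
    /**
     * Hypothetical helper (not part of Hadoop): returns true once enough maps
     * have COMPLETED to satisfy the slowstart fraction. It deliberately ignores
     * how many maps have merely been assigned to containers.
     */
    static boolean shouldScheduleReduces(int totalMaps, int completedMaps, float slowstart) {
        if (totalMaps == 0) {
            return true; // no maps to wait for
        }
        // Assumption: the required count is the fraction of total maps, rounded up.
        int requiredCompletedMaps = (int) Math.ceil(slowstart * totalMaps);
        return completedMaps >= requiredCompletedMaps;
    }

    public static void main(String[] args) {
        // With slowstart = 1.0, one unfinished map must still block the reducers.
        System.out.println(shouldScheduleReduces(100, 99, 1.0f));  // prints "false"
        System.out.println(shouldScheduleReduces(100, 100, 1.0f)); // prints "true"
        // With slowstart = 0.5, half of the maps must have completed.
        System.out.println(shouldScheduleReduces(10, 5, 0.5f));    // prints "true"
    }
}
```

      The point of the sketch is that the decision depends only on completed maps. The bug reported here is that reducers are scheduled once all maps are assigned, regardless of whether this check would pass.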

      Thanks to Phil Su for discovering this issue.

      Attachments

        1. MAPREDUCE-4228.patch
          10 kB
          Jason Darrell Lowe
        2. MAPREDUCE-4228.patch
          10 kB
          Jason Darrell Lowe
        3. MAPREDUCE-4228.patch
          10 kB
          Jason Darrell Lowe


          People

            jlowe Jason Darrell Lowe
            jlowe Jason Darrell Lowe
            Votes: 2
            Watchers: 10
