Hadoop Map/Reduce / MAPREDUCE-1859

maxConcurrentMapTask & maxConcurrentReduceTask per job


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.20.2
    • Fix Version/s: None
    • Component/s: job submission
    • Labels: None

    Description

      It would be valuable if one could specify the maximum number of map/reduce slots to be used by a given job. An example would be a map-reduce job importing from a database, where you don't want 50 map tasks querying one database at the same time, but you also don't want to shrink the overall map task count of the job.
      While this is probably already possible through the Fair/Capacity schedulers or a custom extension, I think it would be a good addition to the default TaskScheduler, since this seems to be more than a rarely used feature.
      It would also help in situations where you don't have control/ownership over the cluster.
      And it is more job-centric, whereas the existing scheduler extensions seem to be more job-type-centric.

      Implementing this feature should be relatively straightforward: add something like jobConf.setMaxConcurrentMapTask(int) and respect this configuration in JobQueueTaskScheduler.
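      For illustration, a job-submission sketch using the proposed setters might look as follows. The two setters and the DbImport driver class are hypothetical additions suggested by this issue, not existing 0.20.2 API; mapper/reducer and input/output configuration are omitted.

        import org.apache.hadoop.mapred.JobClient;
        import org.apache.hadoop.mapred.JobConf;

        public class DbImport {                    // placeholder driver for a database-import job
          public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(DbImport.class);
            conf.setJobName("db-import");
            // Proposed setters from this issue (not present in 0.20.2):
            conf.setMaxConcurrentMapTask(5);       // at most 5 map tasks of this job run at once
            conf.setMaxConcurrentReduceTask(2);    // at most 2 reduce tasks run at once
            // ... mapper, reducer, input/output paths omitted ...
            JobClient.runJob(conf);
          }
        }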

      I am not sure whether this feature would be compatible with the existing Fair/Capacity schedulers.
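      Regardless of scheduler compatibility, the enforcement itself could be quite small. A rough sketch of a check inside JobQueueTaskScheduler.assignTasks(), in the loop that hands out map tasks; the property name and the running-map-task accessor below are assumptions for illustration:

        // Before obtaining a new map task for the current job:
        JobConf jobConf = job.getJobConf();
        int maxConcurrentMaps =
            jobConf.getInt("mapred.job.max.concurrent.map.tasks", Integer.MAX_VALUE);

        // Skip this job if it already has its allowed number of map tasks running
        // (assumes JobInProgress exposes a running-map count, e.g. runningMaps()).
        if (job.runningMaps() >= maxConcurrentMaps) {
          continue;
        }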


          People

            Unassigned Unassigned
            oae Johannes Zillmann
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

    Dates

      Created:
      Updated:
      Resolved: