Hadoop Map/Reduce / MAPREDUCE-1603

Add a plugin class for the TaskTracker to determine available slots

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 0.22.0
    • Fix Version/s: None
    • Component/s: tasktracker
    • Labels: None

    Description

      Currently the number of available map and reduce slots is determined by the configuration. MAPREDUCE-922 has proposed working things out automatically, but that is going to depend a lot on the specific tasks, and is hard to get right for everyone.

      There is a Hadoop cluster near me that would like to use CPU time from other machines in the room: machines which cannot offer storage, but which will have spare CPU time when they aren't running code scheduled with a grid scheduler. The nodes could run a TaskTracker (TT) that reports a dynamic number of slots, the number depending upon the current grid workload.

      I propose we add a plugin point here, so that different people can develop plugin classes that determine the number of available slots based on workload, RAM, CPU, power budget, thermal parameters, etc. There is lots of space for customisation and improvement. And by having it as a plugin, people get to integrate with whatever datacentre schedulers they have without Hadoop itself needing to be altered. The base implementation would behave as today: subtract the number of active map and reduce slots from the configured values, and push that out.
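
      A minimal sketch of what such a plugin point might look like, in Java. Everything here is an assumption made for illustration: the interface name SlotAvailabilityPlugin, its two methods, and both implementing classes are hypothetical and are not part of Hadoop. The first class mirrors the base behaviour described above (configured minus active); the second shows how a site might cap that figure by the number of cores an external grid scheduler is not currently using.

```java
import java.util.function.IntSupplier;

/** Hypothetical plugin point; this is NOT an existing Hadoop interface. */
interface SlotAvailabilityPlugin {
    /** Map slots the TaskTracker should report as free. */
    int availableMapSlots(int configuredMapSlots, int activeMapTasks);

    /** Reduce slots the TaskTracker should report as free. */
    int availableReduceSlots(int configuredReduceSlots, int activeReduceTasks);
}

/** Base behaviour from the issue text: configured value minus active tasks. */
class ConfiguredSlotPlugin implements SlotAvailabilityPlugin {
    @Override
    public int availableMapSlots(int configuredMapSlots, int activeMapTasks) {
        // Never report a negative number of slots.
        return Math.max(0, configuredMapSlots - activeMapTasks);
    }

    @Override
    public int availableReduceSlots(int configuredReduceSlots, int activeReduceTasks) {
        return Math.max(0, configuredReduceSlots - activeReduceTasks);
    }
}

/**
 * Example site-specific plugin (hypothetical): limits the base figure by the
 * number of CPU cores the external grid scheduler has left idle.
 * Simplification: map and reduce slots are each capped independently.
 */
class GridAwareSlotPlugin implements SlotAvailabilityPlugin {
    private final SlotAvailabilityPlugin base = new ConfiguredSlotPlugin();
    private final IntSupplier idleCores; // assumed probe of the grid scheduler

    GridAwareSlotPlugin(IntSupplier idleCores) {
        this.idleCores = idleCores;
    }

    @Override
    public int availableMapSlots(int configuredMapSlots, int activeMapTasks) {
        int free = base.availableMapSlots(configuredMapSlots, activeMapTasks);
        return Math.min(free, Math.max(0, idleCores.getAsInt()));
    }

    @Override
    public int availableReduceSlots(int configuredReduceSlots, int activeReduceTasks) {
        int free = base.availableReduceSlots(configuredReduceSlots, activeReduceTasks);
        return Math.min(free, Math.max(0, idleCores.getAsInt()));
    }
}
```

      With 4 configured map slots and 1 active map task, the base plugin would report 3 free slots; the grid-aware plugin constructed as new GridAwareSlotPlugin(() -> 1) would report 1, because the grid scheduler has left only one core idle. How the TaskTracker would load the class (for example via a configuration key) is left open here.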

          People

            Assignee: Unassigned
            Reporter: Steve Loughran (stevel@apache.org)
            Votes: 0
            Watchers: 7
