
Key: HADOOP-3876
Type: New Feature
Status: Open
Priority: Major
Assignee: Unassigned
Reporter: Vivek Ratan
Votes: 0
Watchers: 7
Hadoop Common

Hadoop Core should support source files for multiple schedulers

Created: 31/Jul/08 05:22 AM   Updated: 11/Dec/08 03:32 AM
Component/s: None
Affects Version/s: None
Fix Version/s: None

Time Tracking:
Not Specified


Description
Besides the default JT scheduling algorithm, work is under way on at least two more schedulers (HADOOP-3445, HADOOP-3746). HADOOP-3412 makes it easier to plug new schedulers into the JT. Where do we place the source files for the various schedulers so that it's easy for users to pick their scheduler of choice during deployment, and easy for developers to add more schedulers to the framework (without inundating it)?
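To make the "choose a scheduler at deployment time" idea concrete, here is a minimal sketch of loading a scheduler by class name from configuration. Every name in it (`Scheduler`, `FifoScheduler`, `FairShareScheduler`, the `example.scheduler.class` key) is hypothetical and chosen only for illustration; it is not the API introduced by HADOOP-3412.

```java
import java.util.HashMap;
import java.util.Map;

public class SchedulerLoader {
    /** Hypothetical minimal scheduler contract; not Hadoop's actual API. */
    public interface Scheduler {
        String name();
    }

    /** Stand-in for a built-in default scheduler. */
    public static class FifoScheduler implements Scheduler {
        public String name() { return "fifo"; }
    }

    /** Stand-in for an alternative scheduler shipped separately. */
    public static class FairShareScheduler implements Scheduler {
        public String name() { return "fair"; }
    }

    /** Hypothetical configuration key naming the scheduler class. */
    public static final String KEY = "example.scheduler.class";

    /** Instantiates the configured scheduler, falling back to the default. */
    public static Scheduler load(Map<String, String> conf) {
        String className = conf.getOrDefault(KEY, FifoScheduler.class.getName());
        try {
            // asSubclass rejects classes that do not implement Scheduler.
            Class<? extends Scheduler> cls =
                Class.forName(className).asSubclass(Scheduler.class);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load scheduler " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(load(conf).name());   // default scheduler

        conf.put(KEY, FairShareScheduler.class.getName());
        System.out.println(load(conf).name());   // scheduler chosen by config
    }
}
```

With this kind of lookup the framework only needs the class on its classpath, so where the source lives (core or contrib) becomes mostly a packaging question.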

Comments
Vivek Ratan added a comment - 31/Jul/08 06:29 AM

Vivek Ratan added a comment - 31/Jul/08 06:49 AM
My guess is that we're going to have some 'core' schedulers (which seem appropriate to belong within the core Hadoop code), and some that are better suited as contrib projects. We should probably place core schedulers under src/mapred/schedulers. So, for example, the 3445 scheduler would go under src/mapred/schedulers/3445 and the 3746 scheduler under src/mapred/schedulers/3746 (replace '3445' and '3746' with more appropriate names, if you wish). Others may go in here as well, if it's felt that they're likely to be deployed in many scenarios, or for whatever other reason. Non-core schedulers can probably go under contrib.

Another option is to place core schedulers under src/core/schedulers, if we want these to be more than just MR schedulers, but folks may not write non-MR schedulers for a while. I prefer keeping stuff under mapred.


Matei Zaharia added a comment - 31/Jul/08 11:03 PM
Just a note: if we do subpackages, we will need a semi-public scheduler API (HADOOP-3822), because Java's default (package-private) visibility doesn't extend to subpackages. On the other hand, I think subpackages are definitely the way to go to make this scalable and clean.
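As a rough illustration of the visibility point, the sketch below uses reflection to list which methods of a class an unrelated class in another package (including a subpackage) could call. `TrackerInternals` and its methods are hypothetical stand-ins, not Hadoop code; the only assumption carried over from the discussion is that a scheduler in a subpackage counts as "another package" to the Java compiler.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class VisibilityCheck {
    /** Hypothetical stand-in for a JobTracker-side class; not Hadoop code. */
    public static class TrackerInternals {
        void killJobInternal() {}          // package-private: same package only
        protected void updateMetrics() {}  // protected: same package or subclasses
        public void listRunningJobs() {}   // public: callable from any package
    }

    /**
     * Returns the sorted names of declared methods that an unrelated class
     * in a different package (a subpackage included) is allowed to call.
     */
    public static List<String> callableFromOtherPackage(Class<?> cls) {
        List<String> names = new ArrayList<>();
        for (Method m : cls.getDeclaredMethods()) {
            if (m.isSynthetic()) {
                continue; // skip compiler- or tool-generated methods
            }
            if (Modifier.isPublic(m.getModifiers())) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(callableFromOtherPackage(TrackerInternals.class));
    }
}
```

Only the public method survives the filter, which is why moving schedulers into subpackages forces the methods they rely on to be widened into some kind of public API.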

Tom White added a comment - 07/Aug/08 10:47 AM
+1 on subpackages (either in src/mapred or src/contrib).

Also, I think that we need to split out the MapReduce daemons into a server package (HADOOP-3916), which we won't publish javadoc for, before we can do HADOOP-3822 properly. This argues for committing the fair scheduler (HADOOP-3746) as a contrib package, as it currently stands. That can be done now; alternatively, it can go into core after HADOOP-3916 and HADOOP-3822 have been done.


David Litster added a comment - 10/Dec/08 11:23 PM
Sorry to add my $0.02 so late in the process, but since you already use Torque and Condor to actually spawn the Hadoop Clusters and start jobs, have you considered adding functionality to HOD to allow external widely-used schedulers (such as the open-source Maui, Moab, PBSpro, LSF, etc. ) to control the scheduling of HOD clusters and jobs via the above-mentioned API (or the command-line or web service APIs)? This would allow sites that already have an existing scheduler and want to add the ability to run Hadoop jobs to be able to do so while taking advantage of their existing infrastructure in terms of users, SLAs, priorities, accounts, groups, etc.

Thanks for all the effort!


Hemanth Yamijala added a comment - 11/Dec/08 03:32 AM
David,

Some clarifications:

  • We do not support Condor in HOD, only Torque.
  • HOD serves its purpose well as a good provisioning system for Hadoop, and is likely to live on to assist test environments where a large cluster can be shared by multiple developers / testers each having their own version of Hadoop to deploy and test.
  • I think it is unlikely for HOD to exist as a scheduling solution for Hadoop, built on top of the resource managers you've mentioned in your comment. That functionality is added by scheduling solutions being built into Hadoop itself, e.g. the Capacity Scheduler (HADOOP-3445) and the Fair Scheduler (HADOOP-3746).
  • We did consider adding support for other resource managers in HOD, precisely for reasons you state - it will have an easier entry into places which are using other open source resource manager / scheduling solutions. However, given the shift in direction from HOD, this consideration did not proceed further.
  • The API in HOD did plan for accommodating other schedulers - you could look at src/contrib/hod/hodlib/Hod/nodePool.py. The abstraction may not be fully suitable, but this would still be the place to start if we want to plug in support for other schedulers.

Hope that answers some of your questions.