Details

    • Type: New Feature
    • Status: Reopened
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Tags:
      scheduler, scheduling

      Description

      The Adaptive Scheduler is a pluggable Hadoop scheduler that automatically adjusts the amount of used resources depending on the performance of jobs and on user-defined high-level business goals.

      Existing Hadoop schedulers are focused on managing large, static clusters in which nodes are added or removed manually. On the other hand, the goal of this scheduler is to improve the integration of Hadoop and the applications that run on top of it with environments that allow a more dynamic provisioning of resources.

      The current implementation is quite straightforward. Users specify a deadline at job submission time, and the scheduler adjusts the resources to meet that deadline (at the moment, the scheduler can be configured to either minimize or maximize the amount of resources). If multiple jobs run simultaneously, the scheduler prioritizes them by deadline. Note that the current approach to estimating the completion time of jobs is quite simplistic: it is based on the time it takes to finish each task, so it works well with regular jobs, but there is still room for improvement for unpredictable jobs.
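The deadline-driven behavior described above can be sketched roughly as follows. This is an illustration only, not code from the attached patches; all class and field names here are invented for the sketch.

```python
import math

class Job:
    def __init__(self, name, deadline, total_tasks, finished_durations, completed):
        self.name = name
        self.deadline = deadline                      # absolute time in seconds
        self.total_tasks = total_tasks
        self.finished_durations = finished_durations  # seconds per finished task
        self.completed = completed                    # number of finished tasks

    def avg_task_time(self):
        # With no finished tasks yet there is nothing to extrapolate from.
        if not self.finished_durations:
            return None
        return sum(self.finished_durations) / len(self.finished_durations)

    def slots_needed(self, now):
        """Minimum parallel slots so the remaining tasks finish by the deadline."""
        avg = self.avg_task_time()
        remaining = self.total_tasks - self.completed
        if avg is None or remaining <= 0:
            return 0
        time_left = self.deadline - now
        if time_left <= 0:
            return remaining  # already late: run everything at once
        # Each slot can complete roughly time_left / avg tasks before the deadline.
        return math.ceil(remaining / max(1, math.floor(time_left / avg)))

def prioritize(jobs):
    """Earliest-deadline-first ordering, as the description suggests."""
    return sorted(jobs, key=lambda j: j.deadline)
```

Note that this extrapolation from average task time only holds when tasks are roughly uniform, which is exactly the limitation the description acknowledges for unpredictable jobs.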

      The idea is to further integrate it with cloud-like and virtual environments (such as Amazon EC2, Emotive, etc.) so that if, for instance, a job isn't able to meet its deadline, the scheduler automatically requests more resources.

      1. MAPREDUCE-1380_0.1.patch
        30 kB
        Jordà Polo
      2. MAPREDUCE-1380_1.1.patch
        45 kB
        Jordà Polo
      3. MAPREDUCE-1380_1.1.pdf
        235 kB
        Jordà Polo
      4. MAPREDUCE-1380-branch-1.2.patch
        68 kB
        Jordà Polo

        Activity

        Jordà Polo added a comment -

        This is still a work in progress, but I'll be submitting a patch and more details about the implementation soon. In the meantime, feel free to share your thoughts and suggestions.

        Thanks.

        Arun C Murthy added a comment -

        Note that the current approach to estimate the completion time of jobs is quite simplistic: it is based on the time it takes to finish each task, so it works well with regular jobs

        Polo - Can you please expand on your definition of 'regular' jobs? Are these, for e.g. part of regular workflows? IAC, how do you propose to communicate this information to the AdaptiveScheduler?

        Jordà Polo added a comment -

        Can you please expand on your definition of 'regular' jobs? Are these, for e.g. part of regular workflows? IAC, how do you propose to communicate this information to the AdaptiveScheduler?

        Actually, "regular" isn't really appropriate here, thanks for pointing that out.

        I actually meant uniform or homogeneous jobs, that is, jobs in which all the tasks take approximately the same amount of time to finish. It would be interesting to communicate some additional data, but so far it only uses standard information as provided by tasktrackers.

        Jordà Polo added a comment -

        I'm attaching a patch with an initial version of the scheduler.

        As I said, this is still a work in progress and I'll be posting new versions as they are ready. There is still some work left to make it useful for everyone and not just for our own needs, but I wanted to contribute it now since it may be of interest to other people.

        (I'll also be posting a PDF with additional documentation later today.)

        steve_l added a comment -

        Not sure I'd put the VM request policy in the scheduler. Better to give it some way of notifying something that there aren't enough resources, including data on the user and the data, and give that other thing the ability to add machines if it so chooses. There may be other concerns like per-user quotas, overall costs, etc., as well as the security issue of giving your scheduler the credentials to work with the infrastructure.

        Jordà Polo added a comment -

        Not sure I'd put the VM request policy in the scheduler. Better to give it some way of notifying something that there aren't enough resources, including data on the user and the data, and give that other thing the ability to add machines if it so chooses. There may be other concerns like per-user quotas, overall costs, etc., as well as the security issue of giving your scheduler the credentials to work with the infrastructure.

        Good point. The current description doesn't explain much, but that's exactly what we have in mind: a multi-tiered system in which the Hadoop scheduler just provides information to the resource manager/provider.
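The separation of concerns being agreed on here can be sketched as follows. This is a hypothetical illustration of the multi-tiered idea, not an interface from the patches; every name in it is made up.

```python
# The scheduler only reports that a job is short of resources; a separate
# provisioner owns policy (quotas, costs) and the infrastructure credentials,
# and decides whether to actually add machines.
class ResourceShortage:
    def __init__(self, job_id, user, slots_missing):
        self.job_id = job_id
        self.user = user
        self.slots_missing = slots_missing

class Provisioner:
    """Owns cloud credentials and policy; the scheduler never touches them."""
    def __init__(self, per_user_quota):
        self.per_user_quota = per_user_quota
        self.granted = {}  # user -> machines already added for that user

    def on_shortage(self, shortage):
        used = self.granted.get(shortage.user, 0)
        allowed = max(0, self.per_user_quota - used)
        to_add = min(shortage.slots_missing, allowed)
        self.granted[shortage.user] = used + to_add
        return to_add  # machines actually requested from the infrastructure
```

With this split, the scheduler needs no EC2/Emotive credentials, and per-user quota or cost policy lives entirely in the provisioner tier.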

        Jordà Polo added a comment -

        I'm sending a new version of the Adaptive Scheduler.

        This new version is actually a new implementation with a different architecture roughly described in the attached PDF document. It supports the same features as the previous version, but at the same time provides new features and a framework for future improvements.

        The new features are mostly focused on making the scheduler more aware of the resources and allowing a dynamic number of running tasks depending on the jobs and their need for resources (instead of a fixed number of slots).
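A minimal sketch of the slot-vs-resource distinction described above, assuming jobs declare a per-task resource profile (the names and fields are invented for this illustration, not taken from the patch):

```python
# Instead of a fixed slot count per node, admit as many concurrent tasks as
# the node's measured capacity can cover for each job's per-task demand.
class Node:
    def __init__(self, memory_mb, cpu):
        self.memory_mb = memory_mb
        self.cpu = cpu

def tasks_that_fit(node, task_memory_mb, task_cpu):
    """How many tasks of a given profile a node can run concurrently."""
    if task_memory_mb <= 0 or task_cpu <= 0:
        return 0
    return min(node.memory_mb // task_memory_mb, int(node.cpu / task_cpu))
```

A lightweight job gets more concurrent tasks than a heavy one on the same node, which a fixed slot count cannot express.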

        It is still a work in progress and requires some additional tuning, but I thought it would be interesting to publish it as it is now, given some of the ideas that have been proposed for Hadoop MapReduce NextGen (MAPREDUCE-279). The scheduler currently leverages job profiling information to ensure optimal cluster utilization, but our goal is to get rid of these profiles and implement a more dynamic approach (e.g. using the resource information data introduced by MAPREDUCE-1218).

        I still don't know the status of the "NextGen" proposal and its implementation, but as soon as more details about NextGen are revealed we'll see whether it makes sense and is worthwhile to adapt or reuse some of these ideas in the new Hadoop MapReduce architecture.

        Jordà Polo added a comment -

        Patch against trunk.

        abc added a comment -

        Where can I download this adaptive scheduler? I am not able to find it; please help me.

        Harsh J added a comment -

        Unsure why this was resolved as fixed. It hasn't been committed anywhere, so reopening as unresolved.

        Priyanka added a comment -

        Where can I download the whole adaptive scheduler? The patch given shows some errors.

        Chen He added a comment -

        This patch may need to be updated against Hadoop 1.x or 2.x.

        Jordà Polo added a comment -

        Attaching a more up-to-date version of the scheduler that should apply cleanly against 1.2.x.


          People

          • Assignee:
            Unassigned
          • Reporter:
            Jordà Polo
          • Votes:
            0
          • Watchers:
            31
