Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      It should be possible to specify a limit to the number of tasks per job permitted to run simultaneously. If, for example, you have a cluster of 50 nodes, with 100 map task slots and 100 reduce task slots, and the configured limit is 25 simultaneous tasks/job, then four or more jobs will be able to run at a time. This will permit short jobs to pass longer-running jobs. This also avoids some problems we've seen with HOD, where nodes are underutilized in their tail, and it should permit improved input locality.
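
      For illustration only, here is a minimal sketch (Java, but not Hadoop's actual JobTracker or scheduler code; every class, field, and method name below is hypothetical) of how a scheduler could skip jobs that have reached such a per-job cap when handing out a free slot:

        // Minimal sketch of a static per-job cap on simultaneously running tasks.
        // Hypothetical names throughout; this is not Hadoop's scheduler code.
        import java.util.List;

        public class PerJobTaskLimit {

            /** Hypothetical view of a job's scheduling state. */
            static class Job {
                final String id;
                int runningTasks;  // map + reduce tasks currently running
                int pendingTasks;  // tasks still waiting for a slot

                Job(String id, int runningTasks, int pendingTasks) {
                    this.id = id;
                    this.runningTasks = runningTasks;
                    this.pendingTasks = pendingTasks;
                }
            }

            /**
             * Return the first job that still has pending work and is below the
             * configured per-job limit, or null if every job is at its cap.
             */
            static Job pickJobForFreeSlot(List<Job> jobs, int maxTasksPerJob) {
                for (Job job : jobs) {
                    if (job.pendingTasks > 0 && job.runningTasks < maxTasksPerJob) {
                        return job;
                    }
                }
                return null;
            }

            public static void main(String[] args) {
                // Limit of 25 tasks/job: the big job is already at its cap,
                // so the short job behind it gets the free slot.
                List<Job> queue = List.of(new Job("big-job", 25, 400),
                                          new Job("short-job", 0, 10));
                Job next = pickJobForFreeSlot(queue, 25);
                System.out.println("assign slot to: " + (next == null ? "none" : next.id));
            }
        }

      With the numbers from the description, 100 map slots divided by a 25-task cap lets four such jobs run their map tasks at the same time.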

        Issue Links

          Activity

          Doug Cutting added a comment -

          This addresses issues raised in HADOOP-2510.

          Arun C Murthy added a comment -

          I'd like to throw job priority into this festering pool...

          At least changing the job-priority (done by the cluster-admin) should result in a change in the number of max_slots... thoughts?
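
          Purely as illustration of this point (not code from any Hadoop patch; the priority levels and weights below are invented for the sketch), changing a job's priority could simply rescale its per-job slot cap:

            // Hypothetical sketch: scale a base per-job slot cap by job priority.
            // Priority names and weights are made up for illustration.
            public class PriorityScaledLimit {

                enum Priority { VERY_LOW, LOW, NORMAL, HIGH, VERY_HIGH }

                static int maxSlotsFor(Priority p, int baseLimit) {
                    double weight;
                    switch (p) {
                        case VERY_LOW:  weight = 0.25; break;
                        case LOW:       weight = 0.5;  break;
                        case HIGH:      weight = 1.5;  break;
                        case VERY_HIGH: weight = 2.0;  break;
                        default:        weight = 1.0;  break;  // NORMAL
                    }
                    return Math.max(1, (int) Math.round(baseLimit * weight));
                }

                public static void main(String[] args) {
                    // With a base cap of 25, bumping a job from NORMAL to HIGH
                    // raises its cap from 25 to 38 as soon as the admin changes it.
                    System.out.println(maxSlotsFor(Priority.NORMAL, 25));  // 25
                    System.out.println(maxSlotsFor(Priority.HIGH, 25));    // 38
                }
            }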

          Doug Cutting added a comment -

          Some discussion of this issue may be found at: http://www.nabble.com/question-about-file-glob-in-hadoop-0.15-tt14702242.html#a14741794

          Doug Cutting added a comment -

          I think a static limit for all jobs would be useful and best to implement first. After some experience with this, we would be better able to address its shortcomings. Possible future extensions might be:

          • dynamically altering the limit, e.g., limit = max(min.tasks.per.job, numSlots / numJobsOutstanding) (a sketch follows this list)
            • ramping up the limit slowly, so that a user's sequential jobs don't have all their slots immediately taken when one job completes
            • ramping down the limit slowly, so that tasks are given an opportunity to finish normally before they are killed.
          • incorporating job priority into the limit
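
          As a rough sketch of the dynamic formula in the first bullet above (not an actual Hadoop API; the ramping step is a hypothetical way to ease the limit toward its target instead of jumping straight to the new value):

            // Sketch only: limit = max(min.tasks.per.job, numSlots / numJobsOutstanding),
            // moved toward gradually so slots are not grabbed or killed all at once.
            public class DynamicTaskLimit {

                static int targetLimit(int minTasksPerJob, int numSlots, int numJobsOutstanding) {
                    if (numJobsOutstanding <= 0) {
                        return numSlots;  // no contention: a lone job may use every slot
                    }
                    return Math.max(minTasksPerJob, numSlots / numJobsOutstanding);
                }

                /** Move the current limit toward the target by at most step per scheduling cycle. */
                static int rampedLimit(int current, int target, int step) {
                    if (target > current) {
                        return Math.min(current + step, target);  // ramp up slowly
                    }
                    return Math.max(current - step, target);      // ramp down slowly
                }

                public static void main(String[] args) {
                    // 200 slots, 4 outstanding jobs, floor of 10 tasks/job -> target 50;
                    // a job currently capped at 25 steps up by at most 5 this cycle.
                    int target = targetLimit(10, 200, 4);
                    System.out.println("target=" + target + " next=" + rampedLimit(25, target, 5));
                }
            }
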
          Ted Dunning added a comment - edited

          (oops... yes, doug anticipated this in his comment and I didn't read very well)

          Presumably the limit could be made dynamic. The limit could be max(static_limit, number of cores in cluster / # active jobs)

          On further reflection, I should note that my big jobs are all limited in pretty much the way that Doug suggests because they are processing a few large files that are unsplittable. This limits the number of slots these big jobs can eat up.

          The result is pretty OK. My little jobs with lots of maps can slide through the cracks most of the time and everything runs pretty well.

          Doug Cutting added a comment -

          > The limit could be max(static_limit, number of cores in cluster / # active jobs)

          Jinx!

          Brice Arnould added a comment -

          The fix for bug 3412 also fixes this one.

          Tom White added a comment -

          I think this is covered by HADOOP-5170. If so, we can close this issue as a duplicate.

          Allen Wittenauer added a comment -

          I'm going to close this out as a duplicate of MAPREDUCE-5583.


            People

            • Assignee: Unassigned
            • Reporter: Doug Cutting
            • Votes: 0
            • Watchers: 10

              Dates

              • Created:
              • Updated:
              • Resolved:

                Development