Uploaded image for project: 'Hadoop Map/Reduce'
  1. Hadoop Map/Reduce
  2. MAPREDUCE-5241

FairScheduler preeempts tasks from the pool whose fair share is below the mininum quota

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 0.20.2, 0.21.0, 0.22.0
    • None
    • contrib/fair-share
    • None

    Description

      The code snippet below is from class FairScheduler.UpdateThread:

          public void run() {
            while (running) {
              try {
                Thread.sleep(updateInterval);
                update();
                dumpIfNecessary();
                preemptTasksIfNecessary();
              } catch (Exception e) { 
                LOG.error("Exception in fair scheduler UpdateThread", e);
              }
            }
          }
      

      Suppose a pool A with the minimum shares of map slots set to 10. And a user submits a job with 5 maps to pool A and the 5 maps starts to run immediately. After update() in UpdateThread.run() is executed, pool A gets map shares of 5, and pool A has 5 running map tasks. Before preemptTasksIfNecessary() is called, a speculative map task could be started, and pool A has 6 running map tasks now, but still gets map shares of 5, so the new speculative map task could be preempted. For the number of running map tasks(6)is still less than the pool A's minimun shares(10), probably it is wrong to preempt any tasks from pool A. A possible fix is to make the call to update() and preemptTasksIfNecessary() atomic to eliminate the race condition, and the following is the first try:

          public void run() {
            while (running) {
              try {
                Thread.sleep(updateInterval);
                synchronized (taskTrackerManager) {
                  synchronized (FairScheduler.this) {
                    update();
                    dumpIfNecessary();
                    preemptTasksIfNecessary();
                  }
                }
              } catch (Exception e) {
                LOG.error("Exception in fair scheduler UpdateThread", e);
              }
            }
          }
      

      Another possible fix is to call FairScheduler.update() in FairScheduler.assignTasks() to re-calculate the fairs share.

      Any comments?

      Attachments

        Activity

          People

            Unassigned Unassigned
            angushe Angus He
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated: