Hadoop Common / HADOOP-4943

Fair share scheduler does not utilize all slots if the task trackers are configured heterogeneously

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.19.0
    • Fix Version/s: 0.19.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      HADOOP-4943: Fixed fair share scheduler to utilize all slots when the task trackers are configured heterogeneously.

      Description

      There is some code in the fair share scheduler that tries to make the load the same across the whole cluster.
      That piece of code breaks if the task trackers are configured differently: we stop assigning more tasks to task trackers whose running-task count is above the cluster average, even though we may still want to assign to them, because other task trackers may have fewer slots.

      We should change the code to maintain a cluster-wide slot usage percentage (instead of an absolute slot count) to make sure the load is evenly distributed.
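      The idea can be illustrated with a short sketch (hypothetical class, method, and parameter names, not the actual FairScheduler code): instead of refusing a tracker once its absolute running-task count exceeds the cluster average, cap each tracker at the cluster-wide usage ratio scaled by that tracker's own slot count.

        // Hypothetical sketch of a ratio-based load check; names are illustrative
        // and do not come from the FairScheduler source.
        public final class LoadCheck {

          /**
           * Returns true if a tracker may be assigned another map task.
           *
           * @param trackerRunningMaps  maps currently running on this tracker
           * @param trackerMapSlots     map slots configured on this tracker (varies per tracker)
           * @param clusterRunnableMaps running + pending maps across the whole cluster (demand)
           * @param clusterMapSlots     total map slots across the whole cluster
           */
          public static boolean canAssignMap(int trackerRunningMaps, int trackerMapSlots,
                                             int clusterRunnableMaps, int clusterMapSlots) {
            if (trackerMapSlots <= 0 || clusterMapSlots <= 0) {
              return false;                       // nothing to assign to
            }
            // Cluster-wide usage ratio: what fraction of all map slots the current demand would fill.
            double mapLoadFactor = (double) clusterRunnableMaps / clusterMapSlots;
            // Cap this tracker at its proportional share of that load, never above its own slot count.
            int cap = Math.min((int) Math.ceil(mapLoadFactor * trackerMapSlots), trackerMapSlots);
            return trackerRunningMaps < cap;
          }
        }

      With heterogeneous trackers (say 4 map slots on some nodes and 2 on others), a tracker's cap scales with its own capacity, so the 4-slot trackers keep receiving tasks past the point where an absolute cluster-average check would have stopped assigning to them.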

  Attachments

  1. hadoop-4943-2.patch
        6 kB
        Matei Zaharia
      2. HADOOP-4943-1.patch
        5 kB
        Zheng Shao

        Activity

        Matei Zaharia added a comment -

        I just committed this. Thanks Zheng!

        I also looked at the default and capacity schedulers, but the default scheduler already seems to have this logic as part of the patch for HADOOP-3136, and the capacity scheduler doesn't try to do this kind of load balancing when there are fewer tasks than slots, so I think this should be a separate JIRA.

        Zheng Shao added a comment -

        +1. Looks good to me.
        Do you want to fix the default scheduler (as you mentioned above) in the same transaction?

        Matei Zaharia added a comment -

        Here's a patch that includes a unit test.

        Matei Zaharia added a comment -

        Looks good to me. This is an issue that actually affects other schedulers too, because that "max load" code was taken from the implementation of the default scheduler (unless it has been fixed since then). Do you think it might be possible to create a unit test for this using the fake tasktrackers in the existing test class?

        Zheng Shao added a comment -

        Tested on an 8-node cluster with (4 map, 2 reduce) slots on half of the cluster and (2 map, 4 reduce) slots on the other half.

        Tested with a streaming job of 24 zero-length inputs + 24 reducers (map/reduce='sleep 60'). We are able to schedule all of the maps at the same time (and likewise for the reducers).

        Tested with a setting of 12; we are able to spread the load uniformly.

        Tested with a setting of 48; we are able to throttle at 24 running maps (instead of going beyond the limit), and likewise for the reducers.

        The test was done on hadoop 0.17.


          People

          • Assignee:
            Zheng Shao
          • Reporter:
            Zheng Shao
          • Votes:
            0
          • Watchers:
            4
