[YARN-6344] Add parameter for rack locality delay in CapacityScheduler - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.9.0, 3.0.0-alpha4
Component/s: capacityscheduler
Labels: None

Hadoop Flags: Reviewed

Description

When relaxing locality from node to rack, the node-locality-delay parameter is used: once the number of missed scheduling opportunities for a scheduler key exceeds the value of this parameter, we relax locality and try to assign the container to a node in the corresponding rack.

On the other hand, when relaxing locality to off-switch (i.e., assigning the container anywhere in the cluster), we use a localityWaitFactor, computed as the number of outstanding requests for a specific scheduler key divided by the size of the cluster.
For applications that request containers in big batches (e.g., traditional MR jobs) on relatively small clusters, the localityWaitFactor does not affect relaxing locality much.
However, for applications that request containers in small batches, this wait factor takes a very small value, which leads to assigning off-switch containers too soon. The situation is even more pronounced in big clusters.
For example, if an application requests only one container per request, locality will be relaxed after a single missed scheduling opportunity.
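
To make the effect concrete, here is a minimal sketch of the two relaxation checks described above. It is not the CapacityScheduler code: the function names, the cap of the factor at 1.0, and the choice to use the outstanding request count as the multiplier for the off-switch threshold are assumptions made for illustration only.

```python
def locality_wait_factor(outstanding_requests: int, cluster_nodes: int) -> float:
    """Outstanding requests divided by cluster size (capped at 1.0 by assumption)."""
    if cluster_nodes <= 0:
        return 1.0
    return min(outstanding_requests / cluster_nodes, 1.0)


def can_relax_to_rack(missed_opportunities: int, node_locality_delay: int) -> bool:
    """Node -> rack: relax once missed opportunities exceed the configured delay."""
    return missed_opportunities > node_locality_delay


def can_relax_to_off_switch(missed_opportunities: int,
                            outstanding_requests: int,
                            cluster_nodes: int) -> bool:
    """Rack -> off-switch: hypothetical threshold of requests * wait factor."""
    factor = locality_wait_factor(outstanding_requests, cluster_nodes)
    return missed_opportunities > outstanding_requests * factor


if __name__ == "__main__":
    # One container per request on a 100-node cluster: the threshold is
    # far below 1, so a single missed opportunity is enough to relax.
    print(can_relax_to_off_switch(1, 1, 100))
    # A batch of 200 requests on the same cluster keeps waiting much longer.
    print(can_relax_to_off_switch(1, 200, 100))
```

Under these assumptions, the threshold for small batches shrinks as the cluster grows, which matches the behavior described above.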

The purpose of this JIRA is to rethink the way we are relaxing locality for off-switch assignments.

Attachments

- YARN-6344.001.patch, 23/Mar/17 01:10, 27 kB, Konstantinos Karanasos
- YARN-6344.002.patch, 31/Mar/17 19:17, 31 kB, Konstantinos Karanasos
- YARN-6344.003.patch, 31/Mar/17 22:59, 31 kB, Konstantinos Karanasos
- YARN-6344.004.patch, 03/Apr/17 23:22, 21 kB, Konstantinos Karanasos
- YARN-6344-branch-2.8.patch, 14/Apr/17 20:48, 20 kB, Konstantinos Karanasos

Issue Links

duplicates

- YARN-6289 Fail to achieve data locality when runing MapReduce and Spark on HDFS (Resolved)

is related to

- YARN-8965 Revisit delay scheduling for cloud environment (Open)
- YARN-4287 Capacity Scheduler: Rack Locality improvement (Resolved)

People

Assignee: Konstantinos Karanasos

Reporter: Konstantinos Karanasos

Votes: 0

Watchers: 11

Dates

Created: 15/Mar/17 19:37

Updated: 01/Nov/18 05:07

Resolved: 30/Jun/17 00:27