Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
The capacity scheduler in MR2 doesn't support delay scheduling for achieving node-level locality. As a result, jobs exhibit poor node-level data locality even when they have good rack locality. This hurts overall job performance, especially on clusters where disk throughput is much higher than network capacity. We should optionally support node-level delay scheduling heuristics similar to what the fair scheduler implements in MR1.
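To make the requested heuristic concrete, here is a minimal sketch of node-level delay scheduling. It is not the capacity scheduler's code: the names `DelayScheduler`, `PendingTask`, `node_locality_delay`, and `try_assign` are hypothetical, and the sketch assumes a single counter of missed scheduling opportunities that is reset only when a node-local assignment succeeds. Off-rack assignment is left out for brevity.

```python
from dataclasses import dataclass, field

NODE_LOCAL = "node-local"
RACK_LOCAL = "rack-local"


@dataclass
class PendingTask:
    """Hypothetical task with the nodes and racks that hold its input data."""
    preferred_nodes: set = field(default_factory=set)
    preferred_racks: set = field(default_factory=set)


class DelayScheduler:
    """Illustrative delay-scheduling policy (assumed names, not a Hadoop API)."""

    def __init__(self, node_locality_delay):
        # Number of offers to decline before accepting a rack-local slot.
        self.node_locality_delay = node_locality_delay
        self.missed_opportunities = 0

    def try_assign(self, task, node, rack):
        """Return the locality level of the assignment, or None if we wait."""
        if node in task.preferred_nodes:
            # Best case: the offered node holds the task's data.
            self.missed_opportunities = 0
            return NODE_LOCAL
        if (rack in task.preferred_racks
                and self.missed_opportunities >= self.node_locality_delay):
            # We have waited long enough, so settle for rack locality.
            return RACK_LOCAL
        # Decline this offer and hope a node-local slot frees up soon.
        self.missed_opportunities += 1
        return None
```

With `node_locality_delay=2`, a task whose data is on node `n1` in rack `r1` declines the first two offers from other nodes in `r1` and accepts the third as rack-local. An offer from `n1` itself is accepted immediately as node-local.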
Attachments
Issue Links
- is related to
  - MAPREDUCE-4305 Implement delay scheduling in capacity scheduler for improving data locality (Open)
- supersedes
  - YARN-284 YARN capacity scheduler doesn't spread MR tasks evenly on an underutilized cluster (Resolved)