Details
Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Version: 0.3.0-incubating
Environment: Tested on Hadoop YARN 2.2 and 2.3 running on a cluster of Ubuntu nodes (4 GB, 8 cores).
Description
YARN's AMRMClient provides an API to control container placement. We need to enhance the Twill API so that users can specify container placement policies. Twill can then use AMRMClient to try to allocate containers according to the specified placement policy.
Added a Placement Policy API to TwillSpecification. For now, the placement policy types are: (a) DISTRIBUTED, which tries to spawn the specified runnables on different hosts; (b) DEFAULT, i.e. no special placement requirements.
Implementation detail: Instances of DISTRIBUTED runnables are provisioned one at a time (as opposed to grouping provision requests into one allocate call based on ResourceSpecs). The AM blacklists the hosts on which existing DISTRIBUTED runnables are running. If no container is provisioned after MAX_CONSTRAINED_PROVISION_ATTEMPTS attempts, the AM relaxes the blacklist constraint (or any other constraint).
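The provisioning loop described above can be sketched as a small, self-contained simulation. This is not the code from the PR: the class name, the `allocate` and `provision` helpers, the slot map, and the value 3 for MAX_CONSTRAINED_PROVISION_ATTEMPTS are all assumptions made for illustration, and no YARN API is called.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical simulation for illustration; this is not Twill's ApplicationMaster code.
class DistributedPlacementSketch {

  // Assumed value: the description names this constant but does not give its value.
  static final int MAX_CONSTRAINED_PROVISION_ATTEMPTS = 3;

  // Stand-in for one allocate call: returns the first host that has a free
  // slot and is not blacklisted, or null if there is none.
  static String allocate(Map<String, Integer> freeSlots, Set<String> blacklist) {
    for (Map.Entry<String, Integer> e : freeSlots.entrySet()) {
      if (e.getValue() > 0 && !blacklist.contains(e.getKey())) {
        return e.getKey();
      }
    }
    return null;
  }

  // Places instances one at a time and returns the host chosen for each.
  // Mutates freeSlots to reflect the slots that were consumed.
  static List<String> provision(int instances, Map<String, Integer> freeSlots) {
    List<String> placements = new ArrayList<>();
    Set<String> blacklist = new HashSet<>();
    for (int i = 0; i < instances; i++) {
      String host = null;
      // Constrained attempts. In this static simulation every retry sees the
      // same state; on a real cluster capacity could free up between attempts.
      for (int attempt = 0;
           attempt < MAX_CONSTRAINED_PROVISION_ATTEMPTS && host == null;
           attempt++) {
        host = allocate(freeSlots, blacklist);
      }
      if (host == null) {
        // Relax the constraint: any host with capacity is now acceptable.
        host = allocate(freeSlots, Collections.<String>emptySet());
      }
      if (host == null) {
        break; // No capacity anywhere; stop provisioning.
      }
      freeSlots.put(host, freeSlots.get(host) - 1);
      placements.add(host);
      blacklist.add(host); // Later instances should avoid this host.
    }
    return placements;
  }

  public static void main(String[] args) {
    Map<String, Integer> freeSlots = new LinkedHashMap<>();
    freeSlots.put("host1", 2);
    freeSlots.put("host2", 1);
    System.out.println(provision(3, freeSlots));
  }
}
```

With two hosts and three instances, the first two instances land on different hosts and the third falls back to host1 once the constrained attempts are exhausted, so the run prints `[host1, host2, host1]`.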
Also, it makes sense to specify Hosts and Racks through the Placement Policy API instead of through the Resource Specification, so that logic has moved into Placement Policy too.
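To make the shape of such a policy concrete, here is a minimal sketch of a data model that carries a policy type together with host and rack hints. Every name in it (`PlacementPolicySketch`, `policyFor`, `NO_POLICY`, the runnable and host names) is hypothetical, and pairing host and rack hints with the DEFAULT type is an assumption; the actual types are the ones in the PR.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model for illustration; not the types added to TwillSpecification.
class PlacementPolicySketch {
  enum Type { DEFAULT, DISTRIBUTED }

  final Type type;
  final Set<String> runnables; // names of the runnables this policy applies to
  final Set<String> hosts;     // preferred hosts, may be empty
  final Set<String> racks;     // preferred racks, may be empty

  PlacementPolicySketch(Type type, List<String> runnables,
                        List<String> hosts, List<String> racks) {
    this.type = type;
    this.runnables = Collections.unmodifiableSet(new LinkedHashSet<>(runnables));
    this.hosts = Collections.unmodifiableSet(new LinkedHashSet<>(hosts));
    this.racks = Collections.unmodifiableSet(new LinkedHashSet<>(racks));
  }

  // Fallback for runnables that no policy mentions: DEFAULT with no hints.
  static final PlacementPolicySketch NO_POLICY = new PlacementPolicySketch(
      Type.DEFAULT,
      Collections.<String>emptyList(),
      Collections.<String>emptyList(),
      Collections.<String>emptyList());

  // Returns the first policy that names the runnable, or NO_POLICY.
  static PlacementPolicySketch policyFor(String runnable,
                                         List<PlacementPolicySketch> policies) {
    for (PlacementPolicySketch policy : policies) {
      if (policy.runnables.contains(runnable)) {
        return policy;
      }
    }
    return NO_POLICY;
  }

  public static void main(String[] args) {
    List<PlacementPolicySketch> policies = Arrays.asList(
        new PlacementPolicySketch(Type.DISTRIBUTED, Arrays.asList("worker"),
            Collections.<String>emptyList(), Collections.<String>emptyList()),
        new PlacementPolicySketch(Type.DEFAULT, Arrays.asList("indexer"),
            Arrays.asList("host1.example.com"), Arrays.asList("/rack1")));
    for (String name : Arrays.asList("worker", "indexer", "web")) {
      PlacementPolicySketch p = policyFor(name, policies);
      System.out.println(name + " -> " + p.type + " hosts=" + p.hosts + " racks=" + p.racks);
    }
  }
}
```

Keeping hosts and racks on the policy rather than on the resource specification means a resource specification only describes how much a container needs, while the policy alone describes where it should run.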
Tests
1. Added unit tests for Placement Policy (using MiniYARNCluster):
(a) Specify DISTRIBUTED runnables and runnables with Hosts and Racks in a Twill app, and verify that all constraints are appropriately honored.
(b) Specify DISTRIBUTED and DEFAULT runnables in a Twill app and verify that all constraints are appropriately honored. Increase the number of instances for all runnables and verify again that all constraints are appropriately honored.
(c) Test the DISTRIBUTED placement policy under stress (i.e. not enough resources available to honor the constraints). Verify that the AM relaxes the constraints and tries its best.
2. Tested on a real cluster.
Please review the API and the changes in the PR: https://github.com/apache/incubator-twill/pull/7