The scheduler could use a helper library that maintains node state and allows matching/sorting queries. There are several reasons for this:
- Today, much of the node state management is done separately in each scheduler. A single shared library would reduce this duplication among schedulers.
- Adding a filtering/matching API would significantly simplify the handling of node labels and locality.
- An API that returns a list of nodes sorted by a custom comparator would help YARN-1011, where we want to sort by allocation for continuous/asynchronous scheduling and by utilization for opportunistic scheduling.
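
To make the three points above concrete, here is a rough sketch of what such a helper might look like. Everything in it is an assumption for illustration: `NodeTrackerSketch`, its `Node` stand-in, and the method names `addOrUpdate`, `getNodes`, and `sortedNodes` are hypothetical and do not correspond to existing YARN classes. A real implementation would track the scheduler's own node type and would need to consider locking and update cost.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

/** Hypothetical shared node-state helper; not an existing YARN class. */
class NodeTrackerSketch {

  /** Simplified stand-in for a scheduler node (illustrative fields only). */
  static final class Node {
    public final String id;
    public final String rack;
    public final String label;
    public final int allocatedMb;
    public final int utilizedMb;

    public Node(String id, String rack, String label,
                int allocatedMb, int utilizedMb) {
      this.id = id;
      this.rack = rack;
      this.label = label;
      this.allocatedMb = allocatedMb;
      this.utilizedMb = utilizedMb;
    }
  }

  // Single place where node state lives, keyed by node id.
  private final Map<String, Node> nodes = new ConcurrentHashMap<>();

  /** Adds a node, or replaces the stored state for an existing node id. */
  public void addOrUpdate(Node node) {
    nodes.put(node.id, node);
  }

  /** Removes a node; returns the removed state, or null if it was unknown. */
  public Node remove(String id) {
    return nodes.remove(id);
  }

  /** Filtering/matching query, e.g. by label or by rack for locality. */
  public List<Node> getNodes(Predicate<Node> filter) {
    List<Node> matched = new ArrayList<>();
    for (Node node : nodes.values()) {
      if (filter.test(node)) {
        matched.add(node);
      }
    }
    return matched;
  }

  /** Returns a snapshot of all nodes ordered by the caller's comparator. */
  public List<Node> sortedNodes(Comparator<Node> comparator) {
    List<Node> snapshot = new ArrayList<>(nodes.values());
    snapshot.sort(comparator);
    return snapshot;
  }
}
```

With this shape, a label or locality lookup becomes a predicate such as `n -> n.label.equals("gpu")` or `n -> n.rack.equals("rackA")`, and the two YARN-1011 orderings become two comparators passed to the same call: `Comparator.comparingInt((Node n) -> n.allocatedMb)` for continuous/asynchronous scheduling and `Comparator.comparingInt((Node n) -> n.utilizedMb)` for opportunistic scheduling.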