The executor interface shall deal with the resources allocated to the executor by the cluster managers (Standalone, YARN, Kubernetes). The executor either needs to be told which resources it was given, or it needs to discover them itself, so that it can report its available resources to the driver to support task scheduling.
Note this is part of a bigger feature for GPU-aware scheduling and covers just how the executor finds its resources. The general flow:
- Users ask for a certain set of resources, for instance a number of GPUs; each cluster manager has its own way to express this.
- The cluster manager allocates a container, or a set of resources in standalone mode.
- When Spark launches the executor in that container, the executor either has to be told what resources it has or it has to discover them automatically.
- The executor has to register with the driver and report the set of resources it has, so the scheduler can use that information to schedule tasks that require a certain amount of each of those resources.
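The discovery step above could be served by a small script the executor runs at launch and whose output it parses before registering with the driver. This is only a sketch under assumptions: the `discover_gpus` helper, the use of `nvidia-smi`, and the JSON shape (a resource name plus a list of addresses) are illustrative, not a committed format.

```python
import json
import subprocess

def discover_gpus():
    """Enumerate GPU addresses on this host and return them in a
    JSON-serializable form the executor could report to the driver.
    The {"name": ..., "addresses": [...]} shape is an assumption."""
    try:
        # One possible probe: ask nvidia-smi for the GPU indices.
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=index", "--format=csv,noheader"],
            text=True,
        )
        addresses = [line.strip() for line in out.splitlines() if line.strip()]
    except (OSError, subprocess.CalledProcessError):
        # No GPUs (or no driver tooling) found on this host.
        addresses = []
    return {"name": "gpu", "addresses": addresses}

if __name__ == "__main__":
    # The executor would capture this stdout and forward the parsed
    # resource set to the driver during registration.
    print(json.dumps(discover_gpus()))
```

The scheduler side would then treat each address as an assignable unit, handing specific addresses to tasks that declared a GPU requirement.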