[SPARK-27024] Executor interface for cluster managers to support GPU resources - ASF JIRA

Details

Type: Story
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.0.0
Fix Version/s: 3.0.0
Component/s: Spark Core
Labels: None
Epic Link: GPU-aware Scheduling

Description

The executor interface shall handle the resources that a cluster manager (Standalone, YARN, Kubernetes) allocates to the executor. The executor either needs to be told which resources it was given or needs to discover them itself, so that it can sync with the driver and expose the available resources for task scheduling.

Note that this is part of a bigger feature for GPU-aware scheduling and covers only how the executor finds its resources. The general flow:

1. Users ask for a certain set of resources, for instance a number of GPUs. Each cluster manager has its own way to do this.
2. The cluster manager allocates a container or a set of resources (standalone mode).
3. When Spark launches the executor in that container, the executor either has to be told what resources it has or has to auto-discover them.
4. The executor registers with the driver and reports the set of resources it has, so that the scheduler can use this information to schedule tasks that require a certain amount of each resource.
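
The last two steps of the flow can be sketched as follows. This is only an illustration, not the Spark implementation: the `ResourceInfo` type, the function names, the JSON shape, and the registration payload are all hypothetical and were chosen just to show an executor that is either told its resources or discovers them, and then reports them to the driver.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class ResourceInfo:
    """A named resource and the addresses assigned to one executor (hypothetical type)."""
    name: str
    addresses: List[str]


def parse_resources(payload: str) -> Dict[str, ResourceInfo]:
    """Parse a JSON array such as [{"name": "gpu", "addresses": ["0", "1"]}] (assumed format)."""
    entries = json.loads(payload)
    return {e["name"]: ResourceInfo(e["name"], list(e["addresses"])) for e in entries}


def resolve_executor_resources(
    told: Optional[str], discover: Callable[[], str]
) -> Dict[str, ResourceInfo]:
    """Use the resources the cluster manager passed in; otherwise fall back to discovery."""
    payload = told if told is not None else discover()
    return parse_resources(payload)


def registration_message(executor_id: str, resources: Dict[str, ResourceInfo]) -> dict:
    """Build the (hypothetical) payload the executor sends to the driver when it registers."""
    return {
        "executorId": executor_id,
        "resources": {name: info.addresses for name, info in resources.items()},
    }


def can_schedule(task_requirements: Dict[str, int], registered: dict) -> bool:
    """Driver-side check: does the executor hold enough of every resource the task needs?"""
    return all(
        len(registered["resources"].get(name, [])) >= amount
        for name, amount in task_requirements.items()
    )
```

For example, an executor that was told `[{"name": "gpu", "addresses": ["0", "1"]}]` would register with two GPU addresses, and under this sketch the driver could place a task requiring two GPUs on it but not one requiring three.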

Issue Links

blocks

SPARK-27363 Mesos support for GPU-aware scheduling (Open)
SPARK-27360 Standalone cluster mode support for GPU-aware scheduling (Resolved)
SPARK-27361 YARN support for GPU-aware scheduling (Resolved)
SPARK-27362 Kubernetes support for GPU-aware scheduling (Resolved)
SPARK-27488 Driver interface to support GPU resources (Resolved)

is related to

SPARK-24615 SPIP: Accelerator-aware task scheduling for Spark (Resolved)
SPARK-27005 Design sketch for SPIP discussion: Accelerator-aware scheduling (Resolved)

links to

GitHub Pull Request #24394
GitHub Pull Request #24406

People

Assignee: Thomas Graves
Reporter: Xingbo Jiang
Votes: 0
Watchers: 10

Dates

Created: 01/Mar/19 14:40
Updated: 18/May/19 03:09
Resolved: 14/May/19 13:48