[MESOS-4424] Initial support for GPU resources. - ASF JIRA

Details

Type: Epic
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.0.0
Component/s: containerization
Labels:
- mesosphere

Epic Name: GPU

Description

Mesos already has generic mechanisms for expressing and isolating resources, and we'd like to expose GPUs as resources that can be consumed and isolated. However, GPUs present unique challenges:

- Some users rely on vendor-specific libraries to interact with the device (e.g. CUDA, HSA), while others rely on portable libraries like OpenCL or OpenGL. These libraries need to be available from within the container.
- GPU hardware has many attributes that may impose scheduling constraints (e.g. core count, total memory, topology via PCI-E or NVLink, driver versions).
- Obtaining utilization information requires vendor-specific approaches.
- Isolated sharing of a GPU device requires vendor-specific approaches.

As such, the focus is on supporting a narrow initial use case, homogeneous device-level GPU support:

- Unlike CPU cores, GPU devices will not initially support fractional sharing across containers.
- Heterogeneity will be supported via other means for now (e.g. using agent attributes to differentiate hardware profiles, using portable libraries like OpenCL, etc).
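
To make the two constraints above concrete, here is a minimal sketch of how a scheduler might choose an offer under device-level allocation. It is not Mesos code: the `Offer` class, the `whole_gpus` and `pick_offer` helpers, and the `gpu_model` attribute name are all hypothetical, chosen only to illustrate whole-device counting and attribute-based hardware matching.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """Hypothetical, simplified stand-in for a resource offer from one agent."""
    agent_id: str
    gpus: float  # GPU amount offered by the agent
    attributes: dict = field(default_factory=dict)  # e.g. a hardware profile


def whole_gpus(offer: Offer) -> int:
    """Count only whole devices; a fractional remainder is not usable."""
    return int(offer.gpus)


def pick_offer(offers, gpus_needed: int, required_attrs: dict):
    """Return the first offer that matches every required attribute and
    has enough whole GPUs, or None if no offer qualifies."""
    if gpus_needed < 1:
        raise ValueError("gpus_needed must be a positive whole number")
    for offer in offers:
        attrs_match = all(
            offer.attributes.get(key) == value
            for key, value in required_attrs.items()
        )
        if attrs_match and whole_gpus(offer) >= gpus_needed:
            return offer
    return None


# Example: the attribute name and values below are assumptions.
offers = [
    Offer("a1", 0.5, {"gpu_model": "k80"}),
    Offer("a2", 4.0, {"gpu_model": "m40"}),
    Offer("a3", 2.0, {"gpu_model": "k80"}),
]
chosen = pick_offer(offers, 2, {"gpu_model": "k80"})
```

In this sketch the framework, not the allocator, handles heterogeneity: it only accepts offers whose agent attributes match the hardware profile it was built for.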

Working group email list: https://groups.google.com/forum/#!forum/mesos-gpus

Issue Links

is related to
- MESOS-5377 Improve DRF behavior with scarce resources. (Accepted)
- MESOS-7080 Expose GPU hardware information to schedulers. (Open)

supersedes
- MESOS-2262 Adding GPGPU resource into Mesos, so we can know if any GPU/Heterogeneous resource are available from slave (Resolved)

links to

Design Doc: GPGPU Resources in Mesos

People

Assignee: Kevin Klues
Reporter: Benjamin Mahler
Shepherd: Benjamin Mahler
Votes: 0
Watchers: 36

Dates

Created: 19/Jan/16 01:44
Updated: 26/Apr/17 16:54
Resolved: 20/Oct/16 22:18