[YARN-8200] Backport resource types/GPU features to branch-3.0/branch-2

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.10.0
    • Component/s: None
    • Labels:
    • Target Version/s:
    • Release Note:
      The generic resource types feature lets admins configure custom resource types beyond memory and CPU. Users can request these resource types, and YARN takes them into account when scheduling.

      This also adds GPU as a native resource type, built on top of the generic resource types feature. It adds support for GPU resource discovery, GPU scheduling and GPU isolation.
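
      To make the release note concrete, the fragment below is a minimal sketch of how an admin might declare GPU as a resource type and enable GPU handling on NodeManagers. This issue does not spell out the configuration, so the property names are an assumption: they are taken from the upstream "Using GPU on YARN" documentation for Hadoop 3.1. Check them against the documentation for the release you run.

```xml
<!-- resource-types.xml: declare the extra resource type (assumed property name) -->
<configuration>
  <property>
    <name>yarn.resource-types</name>
    <value>yarn.io/gpu</value>
  </property>
</configuration>

<!-- yarn-site.xml: enable GPU discovery and isolation on each NodeManager (assumed property name) -->
<configuration>
  <property>
    <name>yarn.nodemanager.resource-plugins</name>
    <value>yarn.io/gpu</value>
  </property>
</configuration>
```

      The two fragments belong in separate files. With the Capacity Scheduler, the upstream documentation also calls for the DominantResourceCalculator so that resources other than memory are considered during scheduling.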

      Description

      We currently need GPU scheduling on our YARN clusters to support deep learning workloads. However, our main production clusters run older versions of branch-2 (2.7 in our case). To avoid supporting too many widely divergent Hadoop versions across multiple clusters, we would like to backport the resource types/resource profiles feature to branch-2, along with the GPU-specific support.

       

      We have done a trial backport of YARN-3926 and some miscellaneous patches in YARN-7069, based on issues we uncovered, and the backport was fairly smooth. We also did a trial backport of most of YARN-6223 (without Docker support).

       

      Regarding the backports, perhaps we can do the development in a feature branch and then merge to branch-2 when ready.

        Attachments

        1. YARN-8200-branch-2.003.patch (857 kB, Jonathan Hung)
        2. YARN-8200-branch-2.002.patch (857 kB, Jonathan Hung)
        3. YARN-8200-branch-3.0.001.patch (357 kB, Jonathan Hung)
        4. YARN-8200-branch-2.001.patch (773 kB, Jonathan Hung)
        5. counter.scheduler.operation.allocate.csv.gpuResources (241 kB, Jonathan Hung)
        6. synth_sls.json (3 kB, Jonathan Hung)
        7. counter.scheduler.operation.allocate.csv.defaultResources (144 kB, Jonathan Hung)

              People

              • Assignee: Jonathan Hung (jhung)
              • Reporter: Jonathan Hung (jhung)
              • Votes: 0
              • Watchers: 18
