  1. Hadoop YARN
  2. YARN-8851

[Umbrella] A pluggable device plugin framework to ease vendor plugin development

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: yarn
    • Labels:
      None

      Description

      At present, YARN supports GPU and FPGA devices natively, through code that is tightly coupled to YARN core. This makes it difficult for a vendor to implement a device plugin, because the developer needs substantial knowledge of YARN internals. It also burdens the community with maintaining both YARN core and vendor-specific code.

      We propose a new device plugin framework that eases vendor device plugin development and provides a more flexible way to integrate with the YARN NodeManager (NM).
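
      To make the idea of a pluggable framework concrete, here is a minimal, self-contained Java sketch of how a vendor-facing contract could be separated from the framework side. Every name in it (SketchDevice, SketchDevicePlugin, FakeVendorPlugin, SketchFramework, the resource name, and the device paths) is hypothetical and chosen for illustration; it is not the interface defined in the attached patches or design proposal.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical device descriptor; the fields are illustrative only.
final class SketchDevice {
  final int id;
  final String devPath;

  SketchDevice(int id, String devPath) {
    this.id = id;
    this.devPath = devPath;
  }
}

// Hypothetical vendor-facing contract (an assumption, not the YARN API):
// the vendor reports devices and reacts to allocation events, without
// touching any NodeManager internals.
interface SketchDevicePlugin {
  String resourceName();
  Set<SketchDevice> getDevices();
  void onDevicesAllocated(Set<SketchDevice> allocated);
  void onDevicesReleased(Set<SketchDevice> released);
}

// A fake vendor plugin that reports two devices and records each callback.
final class FakeVendorPlugin implements SketchDevicePlugin {
  final List<String> events = new ArrayList<>();

  @Override public String resourceName() {
    return "vendor.example/fake-device";
  }

  @Override public Set<SketchDevice> getDevices() {
    Set<SketchDevice> devices = new LinkedHashSet<>();
    devices.add(new SketchDevice(0, "/dev/fake0"));
    devices.add(new SketchDevice(1, "/dev/fake1"));
    return devices;
  }

  @Override public void onDevicesAllocated(Set<SketchDevice> allocated) {
    events.add("allocated:" + allocated.size());
  }

  @Override public void onDevicesReleased(Set<SketchDevice> released) {
    events.add("released:" + released.size());
  }
}

// Stand-in for the framework side: it owns the bookkeeping of free devices
// and notifies the plugin, so vendor code stays free of scheduling logic.
final class SketchFramework {
  private final SketchDevicePlugin plugin;
  private final Deque<SketchDevice> free = new ArrayDeque<>();

  SketchFramework(SketchDevicePlugin plugin) {
    this.plugin = plugin;
    free.addAll(plugin.getDevices());
  }

  int freeCount() {
    return free.size();
  }

  // Hands out `count` free devices, then informs the plugin.
  Set<SketchDevice> allocate(int count) {
    if (count > free.size()) {
      throw new IllegalStateException("not enough free devices");
    }
    Set<SketchDevice> allocated = new LinkedHashSet<>();
    for (int i = 0; i < count; i++) {
      allocated.add(free.poll());
    }
    plugin.onDevicesAllocated(allocated);
    return allocated;
  }

  // Returns devices to the free pool, then informs the plugin.
  void release(Set<SketchDevice> devices) {
    free.addAll(devices);
    plugin.onDevicesReleased(devices);
  }
}

final class DevicePluginSketch {
  public static void main(String[] args) {
    FakeVendorPlugin plugin = new FakeVendorPlugin();
    SketchFramework framework = new SketchFramework(plugin);
    Set<SketchDevice> held = framework.allocate(1);
    framework.release(held);
    System.out.println(plugin.resourceName() + " " + plugin.events);
  }
}
```

      In this sketch the vendor class implements only four methods, while allocation bookkeeping lives entirely in the framework stand-in. That split is the kind of decoupling the proposal aims for, under the assumptions stated above.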

        Attachments

        1. [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal.pdf
          457 kB
          Zhankun Tang
        2. YARN-8851-WIP2-trunk.001.patch
          45 kB
          Zhankun Tang
        3. YARN-8851-WIP3-trunk.001.patch
          45 kB
          Zhankun Tang
        4. [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal-3.pdf
          457 kB
          Zhankun Tang
        5. YARN-8851-WIP4-trunk.001.patch
          48 kB
          Zhankun Tang
        6. YARN-8851-WIP5-trunk.001.patch
          96 kB
          Zhankun Tang
        7. YARN-8851-WIP6-trunk.001.patch
          97 kB
          Zhankun Tang
        8. YARN-8851-WIP7-trunk.001.patch
          111 kB
          Zhankun Tang
        9. YARN-8851-WIP8-trunk.001.patch
          128 kB
          Zhankun Tang
        10. YARN-8851-WIP9-trunk.001.patch
          131 kB
          Zhankun Tang
        11. [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal-4.pdf
          471 kB
          Zhankun Tang
        12. YARN-8851-trunk.001.patch
          137 kB
          Zhankun Tang
        13. YARN-8851-trunk.002.patch
          138 kB
          Zhankun Tang

          Issue Links

          1. Phase 1 - Add configurations for pluggable plugin framework (Sub-task, Resolved, Zhankun Tang)
          2. [YARN-8851] Add basic pluggable device plugin framework (Sub-task, Resolved, Zhankun Tang)
          3. [YARN-8851] Add a shared device mapping manager (scheduler) for device plugins (Sub-task, Resolved, Zhankun Tang)
          4. Phase 1 - Provide an example of fake vendor plugin (Sub-task, Resolved, Zhankun Tang)
          5. Support NM monitoring of device resource through plugin API (Sub-task, Open, Zhankun Tang)
          6. [DevicePlugin] Support NM APIs to query device resource allocation (Sub-task, Resolved, Zhankun Tang)
          7. Support RM API to query aggregated allocation across cluster (Sub-task, Open, Zhankun Tang)
          8. Support isolation in pluggable device framework (Sub-task, Resolved, Zhankun Tang)
          9. Support device topology scheduling (Sub-task, Resolved, Zhankun Tang)
          10. Add well-defined interface in container-executor to support vendor plugins isolation request (Sub-task, Resolved, Zhankun Tang)
          11. Port existing GPU module into pluggable device framework (Sub-task, Resolved, Zhankun Tang)
          12. Documentation of the pluggable device framework (Sub-task, Resolved, Zhankun Tang)
          13. Provide a way/a tool to do vendor plugin sanity-check outside of YARN NM (Sub-task, Open, Zhankun Tang)
          14. Move DockerCommandPlugin volume related APIs' invocation from DockerLinuxContainerRuntime#prepareContainer to #launchContainer (Sub-task, Resolved, Zhankun Tang)
          15. [DevicePlugin] Add an interface for device plugin to provide customized scheduler (Sub-task, Resolved, Zhankun Tang)
          16. [YARN-8851] Phase 1 - Support device isolation and use the Nvidia GPU plugin as an example (Sub-task, Resolved, Zhankun Tang)
          17. Fix the bug in DeviceMappingManager#getReleasingDevices (Sub-task, Resolved, Zhankun Tang)
          18. Fix the bug in DeviceMappingManager#getReleasingDevices (Sub-task, Resolved, Zhankun Tang)
          19. [YARN-8851] Improve debug message in device plugin method compatibility check of ResourcePluginManager (Sub-task, Resolved, Zhankun Tang)
          20. [YARN-8851] Fix a bug that lacking cgroup initialization when bootstrap DeviceResourceHandlerImpl (Sub-task, Resolved, Zhankun Tang)

              People

              • Assignee:
                tangzhankun Zhankun Tang
                Reporter:
                tangzhankun Zhankun Tang
              • Votes:
                0
                Watchers:
                15