  1. Hadoop YARN
  2. YARN-8851

[Umbrella] A pluggable device plugin framework to ease vendor plugin development


    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0
    • Component/s: yarn
    • Labels: None

      Description

      At present, YARN supports GPU and FPGA devices natively, through code that is tightly coupled to YARN internals. This makes it difficult for a vendor to implement a device plugin, because the developer needs extensive knowledge of those internals. It also burdens the community with maintaining both YARN core and vendor-specific code.

      We propose a new device plugin framework that eases vendor device plugin development and provides a more flexible way to integrate with the YARN NodeManager (NM).
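
      To illustrate the separation the proposal aims for, here is a minimal, self-contained sketch in which the vendor implements only a small interface and the framework does the bookkeeping. Every name in it (`VendorDevicePlugin`, `DeviceFramework`, `FakeVendorPlugin`, the resource name `example.com/fakedevice`) is hypothetical and chosen for this example; it does not reproduce the interface defined in the attached design proposal or patches.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical types for illustration only; NOT the actual YARN API.
final class Device {
    final int id;
    final String devPath;
    Device(int id, String devPath) { this.id = id; this.devPath = devPath; }
    @Override public String toString() { return devPath; }
}

// The only contract a vendor would implement in this sketch.
interface VendorDevicePlugin {
    String resourceName();
    Set<Device> discoverDevices();
    void onDevicesAllocated(Set<Device> devices);
    void onDevicesReleased(Set<Device> devices);
}

// Framework side: tracks free devices and which container holds which device.
final class DeviceFramework {
    private final VendorDevicePlugin plugin;
    private final List<Device> free = new ArrayList<>();
    private final Map<String, Set<Device>> byContainer = new HashMap<>();

    DeviceFramework(VendorDevicePlugin plugin) {
        this.plugin = plugin;
        free.addAll(plugin.discoverDevices());
    }

    Set<Device> allocate(String containerId, int count) {
        if (count > free.size()) {
            throw new IllegalStateException("not enough free devices for " + containerId);
        }
        Set<Device> granted = new LinkedHashSet<>();
        for (int i = 0; i < count; i++) {
            granted.add(free.remove(0));
        }
        byContainer.computeIfAbsent(containerId, k -> new LinkedHashSet<>()).addAll(granted);
        plugin.onDevicesAllocated(granted); // vendor hook, e.g. prepare the device
        return granted;
    }

    void release(String containerId) {
        Set<Device> held = byContainer.remove(containerId);
        if (held == null) {
            return; // nothing was allocated to this container
        }
        free.addAll(held);
        plugin.onDevicesReleased(held); // vendor hook, e.g. reset the device
    }

    int freeCount() { return free.size(); }
}

// A fake vendor plugin that reports a fixed number of made-up devices.
final class FakeVendorPlugin implements VendorDevicePlugin {
    private final int deviceCount;
    int inUse = 0;

    FakeVendorPlugin(int deviceCount) { this.deviceCount = deviceCount; }

    @Override public String resourceName() { return "example.com/fakedevice"; }

    @Override public Set<Device> discoverDevices() {
        Set<Device> devices = new LinkedHashSet<>();
        for (int i = 0; i < deviceCount; i++) {
            devices.add(new Device(i, "/dev/fake" + i));
        }
        return devices;
    }

    @Override public void onDevicesAllocated(Set<Device> devices) { inUse += devices.size(); }

    @Override public void onDevicesReleased(Set<Device> devices) { inUse -= devices.size(); }
}

final class DevicePluginSketch {
    public static void main(String[] args) {
        DeviceFramework framework = new DeviceFramework(new FakeVendorPlugin(2));
        Set<Device> granted = framework.allocate("container_01", 1);
        System.out.println("granted=" + granted + ", free=" + framework.freeCount());
        framework.release("container_01");
        System.out.println("free after release=" + framework.freeCount());
    }
}
```

      In this sketch the vendor class never touches allocation state; it only reports devices and reacts to callbacks, which is the kind of decoupling from NM internals that the description asks for.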

        Attachments

        1. [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal.pdf
          457 kB
          Zhankun Tang
        2. [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal-3.pdf
          457 kB
          Zhankun Tang
        3. [YARN-8851] YARN_New_Device_Plugin_Framework_Design_Proposal-4.pdf
          471 kB
          Zhankun Tang
        4. YARN-8851-trunk.001.patch
          137 kB
          Zhankun Tang
        5. YARN-8851-trunk.002.patch
          138 kB
          Zhankun Tang
        6. YARN-8851-WIP2-trunk.001.patch
          45 kB
          Zhankun Tang
        7. YARN-8851-WIP3-trunk.001.patch
          45 kB
          Zhankun Tang
        8. YARN-8851-WIP4-trunk.001.patch
          48 kB
          Zhankun Tang
        9. YARN-8851-WIP5-trunk.001.patch
          96 kB
          Zhankun Tang
        10. YARN-8851-WIP6-trunk.001.patch
          97 kB
          Zhankun Tang
        11. YARN-8851-WIP7-trunk.001.patch
          111 kB
          Zhankun Tang
        12. YARN-8851-WIP8-trunk.001.patch
          128 kB
          Zhankun Tang
        13. YARN-8851-WIP9-trunk.001.patch
          131 kB
          Zhankun Tang

        Issue Links

        1. Phase 1 - Add configurations for pluggable plugin framework
          Sub-task, Resolved, Zhankun Tang
        2. [YARN-8851] Add basic pluggable device plugin framework
          Sub-task, Resolved, Zhankun Tang
        3. [YARN-8851] Add a shared device mapping manager (scheduler) for device plugins
          Sub-task, Resolved, Zhankun Tang
        4. Phase 1 - Provide an example of fake vendor plugin
          Sub-task, Resolved, Zhankun Tang
        5. Support NM monitoring of device resource through plugin API
          Sub-task, Open, Zhankun Tang
        6. [DevicePlugin] Support NM APIs to query device resource allocation
          Sub-task, Resolved, Zhankun Tang
        7. Support RM API to query aggregated allocation across cluster
          Sub-task, Open, Zhankun Tang
        8. Support isolation in pluggable device framework
          Sub-task, Resolved, Zhankun Tang
        9. Support device topology scheduling
          Sub-task, Resolved, Zhankun Tang
        10. Add well-defined interface in container-executor to support vendor plugins isolation request
          Sub-task, Resolved, Zhankun Tang
        11. Port existing GPU module into pluggable device framework
          Sub-task, Resolved, Zhankun Tang
        12. Documentation of the pluggable device framework
          Sub-task, Resolved, Zhankun Tang
        13. Provide a way/a tool to do vendor plugin sanity-check outside of YARN NM
          Sub-task, Open, Zhankun Tang
        14. Move DockerCommandPlugin volume related APIs' invocation from DockerLinuxContainerRuntime#prepareContainer to #launchContainer
          Sub-task, Resolved, Zhankun Tang
        15. [DevicePlugin] Add an interface for device plugin to provide customized scheduler
          Sub-task, Resolved, Zhankun Tang
        16. [YARN-8851] Phase 1 - Support device isolation and use the Nvidia GPU plugin as an example
          Sub-task, Resolved, Zhankun Tang
        17. Fix the bug in DeviceMappingManager#getReleasingDevices
          Sub-task, Resolved, Zhankun Tang
        18. Fix the bug in DeviceMappingManager#getReleasingDevices
          Sub-task, Resolved, Zhankun Tang
        19. [YARN-8851] Improve debug message in device plugin method compatibility check of ResourcePluginManager
          Sub-task, Resolved, Zhankun Tang
        20. [YARN-8851] Fix a bug that lacking cgroup initialization when bootstrap DeviceResourceHandlerImpl
          Sub-task, Resolved, Zhankun Tang

            People

            • Assignee: Zhankun Tang (tangzhankun)
            • Reporter: Zhankun Tang (tangzhankun)
