  1. Hadoop YARN
  2. YARN-2140 Add support for network IO isolation/scheduling for containers
  3. YARN-3443

Create a 'ResourceHandler' subsystem to ease addition of support for new resource types on the NM

Details

    • Sub-task
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 2.8.0, 3.0.0-alpha1
    • nodemanager
    • None
    • Reviewed
    • The current cgroups implementation is closely tied to supporting CPU as a resource. This patch separates the cgroups implementation into a reusable class and provides a simple ResourceHandler subsystem that will enable us to add support for new resource types on the NM, e.g., Network, Disk, etc.

    Description

      Today, support for CPU and memory as resources (on Linux) is implemented in a way that cannot be easily extended to new resource types (e.g., network or disk). For example, some cgroups functionality is implemented in LCE (mountCgroups) and the rest in CgroupsLCEResourcesHandler. CPU-specific functionality is also implemented in CgroupsLCEResourcesHandler, so using this handler automatically enables CPU as a resource. Some cgroups functionality requires elevated/super-user privileges and needs to be implemented via the container-executor binary. Implementing support for a new resource type on Linux using the existing classes/mechanisms would be messy (for example, we might have to significantly modify/bloat CgroupsLCEResourcesHandler). As an alternative, we have implemented a new ‘ResourceHandler’ mechanism that makes things cleaner and enables easier addition of new resource types. When adding support for a new resource type in the NM (from an isolation/enforcement perspective), three different pieces are required:
      1) Generic cgroups utilities that can be re-used across multiple resource handlers (e.g., for CPU, Network, Disk). For example, for net_cls we want to be able to create new cgroups, update cgroup params, read cgroup params, etc.
      2) A mechanism to execute ‘PrivilegedOperation’s whose functionality requires super-user privileges and is implemented by the container-executor binary.
      3) An implementation that is specific to a resource type (i.e., network and disk would each have an implementation that provides isolation/enforcement for that resource type).
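
To make piece 1 concrete, here is a minimal sketch of the three generic operations mentioned for net_cls (create a cgroup, update a param, read a param). It is not the CGroupsHandler interface from the patch: the class name `CGroupSketch`, its method names, and the `<root>/<controller>/<cgroup>` layout with `<controller>.<param>` files are assumptions made for illustration, and the root directory is supplied by the caller so the sketch can run against a scratch directory without privileges.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; NOT the patch's CGroupsHandler.
final class CGroupSketch {
  // Directory under which each controller (cpu, net_cls, ...) has a subdirectory.
  private final Path mountRoot;

  CGroupSketch(Path mountRoot) {
    this.mountRoot = mountRoot;
  }

  // Assumed layout: <mountRoot>/<controller>/<cgroupId>
  Path pathFor(String controller, String cgroupId) {
    return mountRoot.resolve(controller).resolve(cgroupId);
  }

  // Creating a cgroup amounts to creating its directory.
  Path createCGroup(String controller, String cgroupId) throws IOException {
    return Files.createDirectories(pathFor(controller, cgroupId));
  }

  // Assumed param file name: <controller>.<param>, e.g. net_cls.classid
  private Path paramFile(String controller, String cgroupId, String param) {
    return pathFor(controller, cgroupId).resolve(controller + "." + param);
  }

  // Overwrites the param file with the given value.
  void updateParam(String controller, String cgroupId, String param, String value)
      throws IOException {
    Files.write(paramFile(controller, cgroupId, param),
        value.getBytes(StandardCharsets.UTF_8));
  }

  // Reads the param file back, trimming surrounding whitespace.
  String readParam(String controller, String cgroupId, String param)
      throws IOException {
    byte[] raw = Files.readAllBytes(paramFile(controller, cgroupId, param));
    return new String(raw, StandardCharsets.UTF_8).trim();
  }
}
```

Because nothing in the helper is specific to one controller, the same three calls would serve a CPU, network, or disk handler; only the controller name and param names change.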

      Corresponding to the three pieces listed above, the patch for YARN-3443 provides the following:
      1) cgroups functionality that can be used across different resource types. CGroupsHandler.java specifies the interface, and the implementation is in CGroupsHandlerImpl.java. New cgroups controller types can be easily added to CGroupsHandler.java as and when necessary.
      2) PrivilegedOperation.java and PrivilegedOperationExecutor.java wrap the container-executor binary and provide a way of executing operations that require elevated privileges. There are also utility functions that help with ‘batching’ certain kinds of operations in order to avoid multiple invocations of the container-executor binary.
      3) ResourceHandler.java specifies an interface that custom resource handlers are expected to implement. This interface provides hooks for various operations during a container lifecycle: bootstrap, preStart, postComplete, reAcquire, teardown. Each of these hooks returns a list of privileged operations, so that the resulting set of privileged operations can be batched for performance reasons, if necessary. ResourceHandlerChain.java provides a simple chaining mechanism across multiple resource handlers. This is useful when multiple resource handlers are in place. They can be chained in sequence, e.g., cpu, network, disk. A resource handler chain would hook directly into LCE at various points in the container life cycle.
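
The hook-and-chain idea in item 3 can be sketched as follows. This is an illustration under stated assumptions, not the code in the patch: `OpSketch`, `HandlerSketch`, `SingleResourceHandlerSketch` and `HandlerChainSketch` are hypothetical names, the hook parameter types are guesses, and the toy handler only emits an operation from preStart. What it is meant to show is that each hook returns a list of operations and that a chain is itself a handler that calls the same hook on every member in order and concatenates the results.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a privileged operation (not the patch's class).
final class OpSketch {
  final String description;

  OpSketch(String description) {
    this.description = description;
  }
}

// The five hooks named in the description; parameter types are assumptions.
interface HandlerSketch {
  List<OpSketch> bootstrap();
  List<OpSketch> preStart(String containerId);
  List<OpSketch> reAcquire(String containerId);
  List<OpSketch> postComplete(String containerId);
  List<OpSketch> teardown();
}

// Toy handler for one resource type: only preStart needs privileged work here.
final class SingleResourceHandlerSketch implements HandlerSketch {
  private final String resource;

  SingleResourceHandlerSketch(String resource) {
    this.resource = resource;
  }

  @Override public List<OpSketch> bootstrap() { return Collections.emptyList(); }

  @Override public List<OpSketch> preStart(String containerId) {
    return Collections.singletonList(
        new OpSketch("add " + containerId + " to " + resource + " cgroup"));
  }

  @Override public List<OpSketch> reAcquire(String containerId) { return Collections.emptyList(); }
  @Override public List<OpSketch> postComplete(String containerId) { return Collections.emptyList(); }
  @Override public List<OpSketch> teardown() { return Collections.emptyList(); }
}

// Chain: calls the same hook on each handler in order and concatenates the results.
final class HandlerChainSketch implements HandlerSketch {
  private final List<HandlerSketch> handlers;

  HandlerChainSketch(List<? extends HandlerSketch> handlers) {
    this.handlers = new ArrayList<>(handlers);
  }

  private List<OpSketch> collect(Function<HandlerSketch, List<OpSketch>> hook) {
    List<OpSketch> ops = new ArrayList<>();
    for (HandlerSketch handler : handlers) {
      ops.addAll(hook.apply(handler));
    }
    return ops;
  }

  @Override public List<OpSketch> bootstrap() { return collect(HandlerSketch::bootstrap); }
  @Override public List<OpSketch> preStart(String id) { return collect(h -> h.preStart(id)); }
  @Override public List<OpSketch> reAcquire(String id) { return collect(h -> h.reAcquire(id)); }
  @Override public List<OpSketch> postComplete(String id) { return collect(h -> h.postComplete(id)); }
  @Override public List<OpSketch> teardown() { return collect(HandlerSketch::teardown); }
}
```

With a chain built from a "cpu" handler and a "net_cls" handler, a single preStart call yields both operations in chain order. A caller holding that combined list is then free to hand it to the executor in one go, which is the batching opportunity the description refers to.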

      Attachments

        1. YARN-3443.005.patch
          69 kB
          Sidharta Seethana
        2. YARN-3443.004.patch
          70 kB
          Sidharta Seethana
        3. YARN-3443.003.patch
          68 kB
          Sidharta Seethana
        4. YARN-3443.002.patch
          68 kB
          Sidharta Seethana
        5. YARN-3443.001.patch
          68 kB
          Sidharta Seethana

            People

              sidharta-s Sidharta Seethana
              sidharta-s Sidharta Seethana
              Votes: 0
              Watchers: 11
