  1. Spark
  2. SPARK-38447

Using Available Resources in Yarn Cluster Information in Spark Dynamic Allocation

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: YARN
    • Labels: None

    Description

      The YARN cluster manager reports the resources currently available in the cluster (vcores and memory) in the AM-RM heartbeat response. With autoscaling, starting executor containers on nodes that are immediately available usually takes less time than adding new nodes to the cluster. Dynamic allocation can use both pieces of information, the amount of immediately available resources and the latency of adding new nodes, when deciding how many executors to request from the YARN RM.

      This improvement can be built in two parts:

      1. Infrastructure to pass the available vcores and memory from the YARN AM-RM heartbeat response to ExecutorAllocationClient.
      2. Using the available vcores and memory in ExecutorAllocationManager to decide how many executors to request from the YARN RM.

      I'll create a PR for each of the two tasks, one at a time.
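
The sizing decision in part 2 could be sketched roughly as follows. This is a minimal illustration in Python rather than Spark's Scala, and it is not the proposed implementation: the issue does not specify a policy. The names `AvailableResources`, `executors_that_fit`, `target_executors`, and the `max_beyond_available` knob are all hypothetical. The sketch assumes the heartbeat headroom (on the YARN side, what `AllocateResponse.getAvailableResources()` returns) has already been copied into a plain value, and that `executor_memory_mb` is the full container size including memory overhead. It only covers scaling up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailableResources:
    """Hypothetical snapshot of the headroom reported in one heartbeat."""
    vcores: int
    memory_mb: int


def executors_that_fit(available, executor_cores, executor_memory_mb):
    """Number of executors that fit in the immediately available resources."""
    if executor_cores <= 0 or executor_memory_mb <= 0:
        raise ValueError("executor size must be positive")
    by_cores = available.vcores // executor_cores
    by_memory = available.memory_mb // executor_memory_mb
    # The scarcer resource bounds the count; never report a negative number.
    return max(0, min(by_cores, by_memory))


def target_executors(demand, running, available,
                     executor_cores, executor_memory_mb,
                     max_beyond_available=0):
    """Assumed policy: ask for what fits now, plus a bounded number of extra
    executors that would require the cluster to add nodes."""
    additional = max(0, demand - running)
    fits = executors_that_fit(available, executor_cores, executor_memory_mb)
    return running + min(additional, fits + max_beyond_available)


headroom = AvailableResources(vcores=10, memory_mb=20480)
print(executors_that_fit(headroom, 4, 4096))  # 2 (limited by vcores)
print(target_executors(10, 3, headroom, 4, 4096, max_beyond_available=1))  # 6
```

Setting `max_beyond_available` to 0 keeps requests within what can start immediately, while a larger value trades slower node provisioning for more parallelism. How that trade-off should be tuned is left open by the description.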

          People

            Assignee: Unassigned
            Reporter: Abhishek Dixit (abhishekd0907)
            Votes: 0
            Watchers: 2
