Hadoop YARN / YARN-6483

Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes returned to the AM

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1.0, 3.0.1
    • Component/s: resourcemanager
    • Labels:
      None

      Description

      The DECOMMISSIONING node state is currently used as part of the graceful decommissioning mechanism to give tasks time to complete on a node that is scheduled for decommission, and to give reducer tasks time to read the shuffle blocks on that node. YARN also effectively blacklists nodes in the DECOMMISSIONING state by assigning them a capacity of 0, which prevents additional containers from being launched on those nodes, so no more shuffle blocks are written to them. This blacklisting is not effective for applications like Spark, because a Spark executor running in a YARN container keeps receiving tasks after the corresponding node has been blacklisted at the YARN level. We propose modifying the YARN heartbeat mechanism so that nodes transitioning to DECOMMISSIONING are added to the list of updated nodes returned by the Resource Manager in response to the Application Master heartbeat. This way, a Spark application master would be able to blacklist a DECOMMISSIONING node at the Spark level.
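
      As a rough illustration of the consumer side of this proposal, the sketch below shows how an application master might react to the list of updated nodes once DECOMMISSIONING transitions are included in it. It is not taken from any of the attached patches. To stay self-contained it does not use the YARN client classes: NodeState, NodeUpdate, and Tracker are hypothetical stand-ins for the node state and node report data that the Resource Manager returns in the heartbeat response. Treating a later RUNNING update as a recommission is also an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DecommissionAwareScheduling {

    // Hypothetical stand-in for the node states reported by the Resource Manager.
    enum NodeState { RUNNING, DECOMMISSIONING, DECOMMISSIONED, LOST }

    // Hypothetical stand-in for one entry of the "updated nodes" list
    // returned in response to the Application Master heartbeat.
    static final class NodeUpdate {
        final String host;
        final NodeState state;

        NodeUpdate(String host, NodeState state) {
            this.host = host;
            this.state = state;
        }
    }

    // Application-level exclusion list, kept in sync with the node updates.
    static final class Tracker {
        private final Set<String> excludedHosts = new HashSet<>();

        void onUpdatedNodes(List<NodeUpdate> updatedNodes) {
            for (NodeUpdate update : updatedNodes) {
                switch (update.state) {
                    case DECOMMISSIONING:
                    case DECOMMISSIONED:
                    case LOST:
                        // Stop scheduling new tasks on executors of this host.
                        excludedHosts.add(update.host);
                        break;
                    case RUNNING:
                        // Assumption: a RUNNING update means the node is usable again.
                        excludedHosts.remove(update.host);
                        break;
                    default:
                        break;
                }
            }
        }

        boolean isExcluded(String host) {
            return excludedHosts.contains(host);
        }
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        tracker.onUpdatedNodes(Arrays.asList(
                new NodeUpdate("host-a", NodeState.DECOMMISSIONING),
                new NodeUpdate("host-b", NodeState.RUNNING)));
        System.out.println("host-a excluded: " + tracker.isExcluded("host-a"));
        System.out.println("host-b excluded: " + tracker.isExcluded("host-b"));
    }
}
```

      In a real application master the same logic would be driven by the node reports in the allocate response rather than by hand-built NodeUpdate objects, and the exclusion set would feed the application's own task scheduler.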

        Attachments

        1. YARN-6483-v1.patch
          4 kB
          Juan Rodríguez Hortalá
        2. YARN-6483.002.patch
          48 kB
          Juan Rodríguez Hortalá
        3. YARN-6483.003.patch
          68 kB
          Juan Rodríguez Hortalá
        4. YARN-6483.branch-3.0.addendum.patch
          1 kB
          Arun Suresh

              People

              • Assignee:
                Juan Rodríguez Hortalá (juanrh)
              • Reporter:
                Juan Rodríguez Hortalá (juanrh)
              • Votes:
                1
              • Watchers:
                15