  1. Hadoop YARN
  2. YARN-9921

Issue in PlacementConstraint when YARN Service AM retries allocation on component failure.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.3.0, 3.1.4
    • Component/s: None
    • Labels: None

    Description

      When the YARN Service AM tries to relaunch a container after a component failure, the AM heartbeat fails with the following PlacementConstraints error:

      ERROR impl.AMRMClientAsyncImpl: Exception on heartbeat
      org.apache.hadoop.yarn.exceptions.YarnException: org.apache.hadoop.yarn.exceptions.SchedulerInvalidResoureRequestException: Invalid updated SchedulingRequest added to scheduler, we only allows changing numAllocations for the updated SchedulingRequest. Old=SchedulingRequestPBImpl{priority=0, allocationReqId=0, executionType={Execution Type: GUARANTEED, Enforce Execution Type: true}, allocationTags=[component], resourceSizing=ResourceSizingPBImpl{numAllocations=0, resources=<memory:557056, vCores:1>}, placementConstraint=notin,node,llap:notin,node,yarn_node_partition/=[label]} new=SchedulingRequestPBImpl{priority=0, allocationReqId=0, executionType={Execution Type: GUARANTEED, Enforce Execution Type: true}, allocationTags=[component], resourceSizing=ResourceSizingPBImpl{numAllocations=1, resources=<memory:557056, vCores:1>}, placementConstraint=notin,node,component:notin,node,yarn_node_partition/=[label]}, if any fields need to be updated, please cancel the old request (by setting numAllocations to 0) and send a SchedulingRequest with different combination of priority/allocationId
      

      The message shows that the new SchedulingRequest is valid: every field matches the old request except numAllocations, as expected. Even so, the equals check below in SingleConstraintAppPlacementAllocator fails.

      // Compare two objects
      if (!schedulingRequest.equals(newSchedulingRequest)) {
        // Rollback #numAllocations
        sizing.setNumAllocations(newNumAllocations);
        throw new SchedulerInvalidResoureRequestException(
            "Invalid updated SchedulingRequest added to scheduler, "
                + " we only allows changing numAllocations for the updated "
                + "SchedulingRequest. Old=" + schedulingRequest.toString()
                + " new=" + newSchedulingRequest.toString()
                + ", if any fields need to be updated, please cancel the "
                + "old request (by setting numAllocations to 0) and send a "
                + "SchedulingRequest with different combination of "
                + "priority/allocationId");
      }
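
      The check above can be illustrated with a minimal, self-contained sketch. It assumes the scheduler temporarily sets the new request's numAllocations to the old value before calling equals, which the "Rollback #numAllocations" comment suggests but the excerpt does not show. The Request class and isValidUpdate method are hypothetical stand-ins, not Hadoop classes. The sketch only shows the comparison rule: any difference in a compared field other than numAllocations causes the update to be rejected.

```java
import java.util.Collections;
import java.util.Objects;
import java.util.Set;

public class SchedulingRequestUpdateSketch {

  /** Hypothetical, simplified stand-in for a scheduling request (not the Hadoop class). */
  static final class Request {
    final int priority;
    final long allocationRequestId;
    final Set<String> allocationTags;
    final String placementConstraint;
    int numAllocations; // the only field an update is allowed to change

    Request(int priority, long allocationRequestId, Set<String> allocationTags,
        String placementConstraint, int numAllocations) {
      this.priority = priority;
      this.allocationRequestId = allocationRequestId;
      this.allocationTags = allocationTags;
      this.placementConstraint = placementConstraint;
      this.numAllocations = numAllocations;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof Request)) {
        return false;
      }
      Request other = (Request) o;
      // Every field participates, so one mismatch anywhere fails the comparison.
      return priority == other.priority
          && allocationRequestId == other.allocationRequestId
          && numAllocations == other.numAllocations
          && Objects.equals(allocationTags, other.allocationTags)
          && Objects.equals(placementConstraint, other.placementConstraint);
    }

    @Override
    public int hashCode() {
      // The mutable numAllocations is left out; equal objects still hash equally.
      return Objects.hash(priority, allocationRequestId, allocationTags,
          placementConstraint);
    }
  }

  /** Returns true if newReq differs from oldReq in numAllocations only. */
  static boolean isValidUpdate(Request oldReq, Request newReq) {
    int requested = newReq.numAllocations;
    // Assumed step: align numAllocations so equals() ignores that difference.
    newReq.numAllocations = oldReq.numAllocations;
    try {
      return oldReq.equals(newReq);
    } finally {
      // Roll back so the caller still sees the requested count.
      newReq.numAllocations = requested;
    }
  }

  public static void main(String[] args) {
    Set<String> tags = Collections.singleton("component");
    Request old = new Request(0, 0L, tags, "notin,node,component", 0);
    Request retry = new Request(0, 0L, tags, "notin,node,component", 1);
    Request changed = new Request(0, 0L, tags, "notin,node,other", 1);

    System.out.println(isValidUpdate(old, retry));   // prints "true"
    System.out.println(isValidUpdate(old, changed)); // prints "false"
  }
}
```

      In this model a retry that changes only numAllocations passes. A failure like the one reported therefore points to some compared field differing between the old and new request objects, even when their printed forms look the same.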
      

      Attachments

        1. differenceProtobuf.png
          289 kB
          Tarun Parimi
        2. YARN-9921.001.patch
          4 kB
          Tarun Parimi

          People

            Assignee: Tarun Parimi (tarunparimi)
            Reporter: Tarun Parimi (tarunparimi)
            Votes: 0
            Watchers: 6
