Hadoop YARN / YARN-11082

Use node label resource as denominator to decide which resource is dominant

Details

    Description

      We use the cluster resource as the denominator to decide which resource is dominant in AbstractCSQueue#canAssignToThisQueue. However, the nodes in our cluster are configured differently.

      2021-12-09 10:24:37,069 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator: assignedContainer application attempt=appattempt_1637412555366_1588993_000001 container=null queue=root.a.a1.a2 clusterResource=<memory:175117312, vCores:40222> type=RACK_LOCAL requestedPartition=x
      2021-12-09 10:24:37,069 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue: Used resource=<memory:3381248, vCores:687> exceeded maxResourceLimit of the queue =<memory:3420315, vCores:687>

      2021-12-09 10:24:37,069 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Failed to accept allocation proposal

      We can see that even though root.a.a1.a2 has used 687/687 vCores, the following check in AbstractCSQueue#canAssignToThisQueue still returns false:

      Resources.greaterThanOrEqual(resourceCalculator, clusterResource,
      usedExceptKillable, currentLimitResource)

      clusterResource = <memory:175117312, vCores:40222>
      usedExceptKillable = <memory:3381248, vCores:687>
      currentLimitResource = <memory:3420315, vCores:687>

      currentLimitResource:
      memory : 3381248/175117312 = 0.01930847362
      vCores : 687/40222 = 0.01708020486

      usedExceptKillable:
      memory : 3384320/175117312 = 0.01932601615
      vCores : 688/40222 = 0.01710506687

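      Measured against the whole cluster, the memory share is larger than the vCores share, so DRF treats memory as the dominant resource and the comparison returns false in this scenario.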
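
      The share arithmetic above can be reproduced with a small standalone sketch. It is a simplified model of dominant-share comparison, not the Hadoop ResourceCalculator code: the class and method names are made up for illustration, tie-breaking on the secondary resource is not modeled, and the partition size is a hypothetical value, since the issue does not state the actual resources of partition x.

```java
// Simplified model of dominant-share arithmetic; NOT the Hadoop ResourceCalculator API.
class DominantShareSketch {

    // Returns {memoryShare, vcoreShare} of a resource relative to a denominator.
    static double[] shares(long memory, long vcores, long denomMemory, long denomVcores) {
        return new double[] {
            (double) memory / denomMemory,
            (double) vcores / denomVcores
        };
    }

    // Name of the resource type with the larger share (memory wins ties).
    static String dominant(double[] s) {
        return s[0] >= s[1] ? "memory" : "vCores";
    }

    // The larger of the two shares.
    static double dominantShare(double[] s) {
        return Math.max(s[0], s[1]);
    }

    public static void main(String[] args) {
        // Values taken from the log lines in this issue.
        long clusterMem = 175117312L, clusterVcores = 40222L;
        long usedMem = 3381248L, usedVcores = 687L;
        long limitMem = 3420315L, limitVcores = 687L;

        double[] used = shares(usedMem, usedVcores, clusterMem, clusterVcores);
        double[] limit = shares(limitMem, limitVcores, clusterMem, clusterVcores);
        System.out.println("cluster denominator, dominant of used: " + dominant(used));
        System.out.println("used dominant share >= limit dominant share: "
            + (dominantShare(used) >= dominantShare(limit)));

        // HYPOTHETICAL partition size, invented for illustration only.
        long partMem = 17511731L, partVcores = 2000L;
        double[] usedInPartition = shares(usedMem, usedVcores, partMem, partVcores);
        System.out.println("partition denominator, dominant of used: "
            + dominant(usedInPartition));
    }
}
```

      With the cluster as denominator, memory is the dominant resource and the used dominant share stays below the limit, even though vCores are fully consumed. With a smaller, vCores-constrained denominator such as the assumed partition, vCores become the dominant resource instead.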

      Attachments

        1. YARN-11082.001.patch
          2 kB
          Bo Li

            People

              Assignee: Unassigned
              Reporter: Bo Li (brightk)

                Time Tracking

                  Original Estimate - Not Specified
                  Remaining Estimate - 0h
                  Time Spent - 20m