  1. Hadoop YARN
  2. YARN-4133

Containers to be preempted leak in FairScheduler preemption logic.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: fairscheduler
    • Labels:
      None

      Description

      Containers marked for preemption can leak in the FairScheduler preemption logic. Preemption may be missed because containers are wrongly removed from warnedContainers. The problem is in preemptResources.
      Two issues can cause containers to be wrongly removed from warnedContainers.
      First, the condition check omits the container state RMContainerState.ACQUIRED:

      (container.getState() == RMContainerState.RUNNING ||
                    container.getState() == RMContainerState.ALLOCATED)
      

      Second, when isResourceGreaterThanNone(toPreempt) returns false, the container should not be removed from warnedContainers. A container should be removed from warnedContainers only if it is not in state RMContainerState.RUNNING, RMContainerState.ALLOCATED, or RMContainerState.ACQUIRED. The current code removes it in both cases:

            if ((container.getState() == RMContainerState.RUNNING ||
                    container.getState() == RMContainerState.ALLOCATED) &&
                    isResourceGreaterThanNone(toPreempt)) {
              warnOrKillContainer(container);
              Resources.subtractFrom(toPreempt, container.getContainer().getResource());
            } else {
              warnedIter.remove();
            }
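
The two conditions described above can be separated as in the following minimal sketch. It is not the attached patch and does not use the Hadoop classes: State, Container, and processWarned are hypothetical stand-ins for RMContainerState, RMContainer, and the loop in preemptResources, and the preemption budget is reduced to a single memory value instead of a Resource.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class WarnedContainersSketch {
  // Simplified stand-in for RMContainerState (the real enum has more states).
  enum State { ALLOCATED, ACQUIRED, RUNNING, COMPLETED }

  // Hypothetical stand-in for RMContainer: an id, a state, and a memory size.
  static final class Container {
    final String id;
    final State state;
    final int memoryMb;

    Container(String id, State state, int memoryMb) {
      this.id = id;
      this.state = state;
      this.memoryMb = memoryMb;
    }
  }

  // States in which a container still holds resources and can be preempted.
  static final Set<State> PREEMPTABLE =
      EnumSet.of(State.ALLOCATED, State.ACQUIRED, State.RUNNING);

  /**
   * Walks the warned list once. Returns the ids of containers that would be
   * handed to warnOrKillContainer. A container is dropped from the list only
   * when its state is no longer preemptable, never because the budget ran out.
   */
  static List<String> processWarned(List<Container> warned, int toPreemptMb) {
    List<String> actedOn = new ArrayList<>();
    int remainingMb = toPreemptMb;
    Iterator<Container> it = warned.iterator();
    while (it.hasNext()) {
      Container c = it.next();
      if (!PREEMPTABLE.contains(c.state)) {
        it.remove();                 // container is gone; stop tracking it
      } else if (remainingMb > 0) {
        actedOn.add(c.id);           // stands in for warnOrKillContainer(c)
        remainingMb -= c.memoryMb;   // stands in for Resources.subtractFrom
      }
      // Otherwise the budget is exhausted: keep the container for a later pass.
    }
    return actedOn;
  }

  public static void main(String[] args) {
    List<Container> warned = new ArrayList<>(Arrays.asList(
        new Container("c1", State.RUNNING, 1024),
        new Container("c2", State.ACQUIRED, 1024),
        new Container("c3", State.COMPLETED, 1024),
        new Container("c4", State.RUNNING, 1024)));
    List<String> actedOn = processWarned(warned, 1536);
    System.out.println(actedOn + " stillTracked=" + warned.size());
  }
}
```

With a 1536 MB budget, c1 and c2 are acted on, c3 is dropped because it has completed, and c4 stays in the list even though the budget is exhausted by the time it is visited.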
      

      Once a container is wrongly removed from warnedContainers, it will never be preempted: it is already in FSAppAttempt#preemptionMap, and FSAppAttempt#preemptContainer does not return containers that are in FSAppAttempt#preemptionMap.

        public RMContainer preemptContainer() {
          if (LOG.isDebugEnabled()) {
            LOG.debug("App " + getName() + " is going to preempt a running " +
                "container");
          }
      
          RMContainer toBePreempted = null;
          for (RMContainer container : getLiveContainers()) {
            if (!getPreemptionContainers().contains(container) &&
                (toBePreempted == null ||
                    comparator.compare(toBePreempted, container) > 0)) {
              toBePreempted = container;
            }
          }
          return toBePreempted;
        }
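
The effect of that selection loop can be illustrated with a small standalone sketch. It assumes plain string ids in place of RMContainer objects, and the names pickCandidate and marked are hypothetical: marked stands in for the containers recorded in FSAppAttempt#preemptionMap.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PreemptionLeakSketch {
  /**
   * Same shape as the loop in preemptContainer: skip anything already marked
   * for preemption and keep the candidate that sorts first under cmp.
   */
  static String pickCandidate(List<String> live, Set<String> marked,
                              Comparator<String> cmp) {
    String best = null;
    for (String c : live) {
      if (!marked.contains(c) && (best == null || cmp.compare(best, c) > 0)) {
        best = c;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    List<String> live = Arrays.asList("c1", "c2");
    Set<String> marked = new HashSet<>();  // stand-in for preemptionMap
    Set<String> warned = new HashSet<>();  // stand-in for warnedContainers
    Comparator<String> cmp = Comparator.naturalOrder();

    String first = pickCandidate(live, marked, cmp);
    marked.add(first);
    warned.add(first);

    // Simulate the bug: the container is dropped from the warned set while it
    // is still live and still marked.
    warned.remove(first);

    // The leaked container is skipped from now on; only c2 can be selected.
    System.out.println(first + " then " + pickCandidate(live, marked, cmp));
  }
}
```

Because c1 is still marked but no longer tracked in the warned set, nothing revisits it: the scheduler side has forgotten it, and the application side refuses to offer it again.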
      

        Attachments

        1. YARN-4133.000.patch
          3 kB
          Zhihai Xu

              People

              • Assignee:
                zxu Zhihai Xu
                Reporter:
                zxu Zhihai Xu
              • Votes:
                0
                Watchers:
                5
