Hadoop YARN / YARN-3202

Improve master container resource release time when work-preserving restart is enabled


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Not A Problem
    • None
    • None
    • Component/s: resourcemanager
    • None

    Description

      When an NM registers with the RM and reports a completed container status for the master container, the master container's resources are released immediately by triggering the CONTAINER_FINISHED event. This frees all the resources held by the master container so that they can be allocated to the pending resource requests of other applications.

      But when RM work-preserving restart is enabled, if the master container state is completed, the attempt does not move to FINISHING until container expiry is triggered by the container liveliness monitor. I think the code below need not check whether work-preserving restart is enabled, so that the master container's resources are released immediately and allocated to the pending resource requests of other applications:

          // Handle received container status, this should be processed after new
          // RMNode inserted
          if (!rmContext.isWorkPreservingRecoveryEnabled()) {
            if (!request.getNMContainerStatuses().isEmpty()) {
              LOG.info("received container statuses on node manager register :"
                  + request.getNMContainerStatuses());
              for (NMContainerStatus status : request.getNMContainerStatuses()) {
                handleNMContainerStatus(status, nodeId);
              }
            }
          }
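
      To make the difference concrete, here is a minimal, self-contained sketch of the two behaviours. It is a toy model, not the RM code: RegisterSketch, ContainerReport, releasedOnRegister and the keepGuard flag are hypothetical names introduced only for illustration, and adding an ID to a list stands in for triggering CONTAINER_FINISHED.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT the real YARN classes.
class RegisterSketch {

    /** Toy stand-in for a container status reported by an NM at registration. */
    static final class ContainerReport {
        final String containerId;
        final boolean completed;
        final boolean masterContainer;

        ContainerReport(String containerId, boolean completed, boolean masterContainer) {
            this.containerId = containerId;
            this.completed = completed;
            this.masterContainer = masterContainer;
        }
    }

    /**
     * Returns the IDs of completed master containers whose resources would be
     * released at registration time in this toy model.
     *
     * keepGuard == true  models the quoted code (statuses are skipped when
     *                    work-preserving recovery is enabled).
     * keepGuard == false models the change suggested in this issue.
     */
    static List<String> releasedOnRegister(List<ContainerReport> reports,
                                           boolean workPreservingRecoveryEnabled,
                                           boolean keepGuard) {
        List<String> released = new ArrayList<>();
        if (keepGuard && workPreservingRecoveryEnabled) {
            // Reported statuses are not processed here; in this model nothing
            // is released until some later mechanism (such as expiry) acts.
            return released;
        }
        for (ContainerReport report : reports) {
            if (report.completed && report.masterContainer) {
                // Stand-in for triggering CONTAINER_FINISHED.
                released.add(report.containerId);
            }
        }
        return released;
    }

    public static void main(String[] args) {
        List<ContainerReport> reports = new ArrayList<>();
        reports.add(new ContainerReport("container_01", true, true));
        reports.add(new ContainerReport("container_02", false, false));

        // Guard kept, work-preserving recovery on: nothing released at registration.
        System.out.println(releasedOnRegister(reports, true, true));
        // Guard removed: the completed master container is released immediately.
        System.out.println(releasedOnRegister(reports, true, false));
    }
}
```

      In this model, the only case affected by dropping the guard is the one where work-preserving recovery is enabled; with recovery disabled both variants behave the same.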
      

      Attachments

        1. 0001-YARN-3202.patch
          2 kB
          Rohith Sharma K S

          People

            Assignee: Rohith Sharma K S
            Reporter: Rohith Sharma K S
            Votes: 0
            Watchers: 6
