Hadoop YARN / YARN-11178

Avoid busy-waiting and wasted CPU in the DelegationTokenRenewerPoolTracker thread

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.3.1, 3.3.2, 3.3.3, 3.3.4
    • Fix Version/s: None
    • Component/s: resourcemanager, security
    • Environment: Hadoop 3.3.3 with Kerberos, Ranger 2.1.0, Hive 2.3.7 and Spark 3.0.3

    Description

      The DelegationTokenRenewerPoolTracker thread busy-spins and wastes CPU on empty polling iterations whenever the `futures` map contains no delegation token renewer event tasks:

      // org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.DelegationTokenRenewerPoolTracker#run
      @Override
      public void run() {
        // this while (true) loop spins without pausing when `futures` is empty
        while (true) {
          for (Map.Entry<DelegationTokenRenewerEvent, Future<?>> entry : futures
              .entrySet()) {
            DelegationTokenRenewerEvent evt = entry.getKey();
            Future<?> future = entry.getValue();
            try {
              future.get(tokenRenewerThreadTimeout, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
      
              // Cancel thread and retry the same event in case of timeout
              if (future != null && !future.isDone() && !future.isCancelled()) {
                future.cancel(true);
                futures.remove(evt);
                if (evt.getAttempt() < tokenRenewerThreadRetryMaxAttempts) {
                  renewalTimer.schedule(
                      getTimerTask((AbstractDelegationTokenRenewerAppEvent) evt),
                      tokenRenewerThreadRetryInterval);
                } else {
                  LOG.info(
                      "Exhausted max retry attempts {} in token renewer "
                          + "thread for {}",
                      tokenRenewerThreadRetryMaxAttempts, evt.getApplicationId());
                }
              }
            } catch (Exception e) {
              LOG.info("Problem in submitting renew tasks in token renewer "
                  + "thread.", e);
            }
          }
        }
      }
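
      To make the cost of the empty loop concrete, here is a minimal, self-contained sketch that counts how many passes such a loop completes over an empty map in a short time window. It is not Hadoop code: the class `BusySpinDemo`, the method `countEmptyPasses`, and the `String` key type are hypothetical stand-ins chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Future;

class BusySpinDemo {
  // Hypothetical stand-in for the tracker's `futures` map; it stays empty here.
  static final Map<String, Future<?>> futures = new ConcurrentHashMap<>();

  // Runs a tracker-style loop for roughly `millis` ms and returns how many
  // times the outer loop body executed without finding any work.
  static long countEmptyPasses(long millis) {
    long deadline = System.nanoTime() + millis * 1_000_000L;
    long passes = 0;
    while (System.nanoTime() < deadline) {
      for (Map.Entry<String, Future<?>> entry : futures.entrySet()) {
        // Never reached while the map is empty.
        entry.getValue().isDone();
      }
      passes++;
    }
    return passes;
  }

  public static void main(String[] args) {
    System.out.println("empty passes in 50 ms: " + countEmptyPasses(50));
  }
}
```

      The exact count depends on the machine, but every pass is pure overhead: with nothing in the map, the thread never blocks and keeps one core busy.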

      A better approach is to wait for a while when the `futures` map is empty. In addition, once a renewer task is done or cancelled, its future should be removed from the `futures` map to avoid a memory leak:

      @Override
      public void run() {
        while (true) {
          // wait for a while when the futures map is empty
          if (futures.isEmpty()) {
            synchronized (this) {
              try {
                // wait for tokenRenewerThreadTimeout, clamped to the range [500, 10000] ms
                long waitingTimeMs = Math.min(10000, Math.max(500, tokenRenewerThreadTimeout));
                LOG.info("Delegation token renewer pool is empty, waiting for {} ms.", waitingTimeMs);
                wait(waitingTimeMs);
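              } catch (InterruptedException e) {
                LOG.warn("Delegation token renewer pool tracker was interrupted while waiting.");
                // restore the interrupt flag and stop; looping again with the flag set
                // would make wait() throw immediately and spin
                Thread.currentThread().interrupt();
                return;
              }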
            }
            if (futures.isEmpty()) {
              continue;
            }
          }
          for (Map.Entry<DelegationTokenRenewerEvent, Future<?>> entry : futures
              .entrySet()) {
            DelegationTokenRenewerEvent evt = entry.getKey();
            Future<?> future = entry.getValue();
            try {
              future.get(tokenRenewerThreadTimeout, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
      
              // Cancel thread and retry the same event in case of timeout
              if (future != null && !future.isDone() && !future.isCancelled()) {
                future.cancel(true);
                futures.remove(evt);
                if (evt.getAttempt() < tokenRenewerThreadRetryMaxAttempts) {
                  renewalTimer.schedule(
                      getTimerTask((AbstractDelegationTokenRenewerAppEvent) evt),
                      tokenRenewerThreadRetryInterval);
                } else {
                  LOG.info(
                      "Exhausted max retry attempts {} in token renewer "
                          + "thread for {}",
                      tokenRenewerThreadRetryMaxAttempts, evt.getApplicationId());
                }
              }
            } catch (Exception e) {
              LOG.info("Problem in submitting renew tasks in token renewer "
                  + "thread.", e);
            }
            // remove tasks that are done or cancelled
            if (future.isDone() || future.isCancelled()) {
              try {
                futures.remove(evt);
                LOG.info("Removed done or cancelled renew tasks of {} in token renewer thread.", evt.getApplicationId());
              } catch (Exception e) {
                LOG.warn("Problem in removing done or cancelled renew tasks in token renewer thread.", e);
              }
            }
          }
        }
      } 
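
      The two ideas in the proposal (a bounded idle wait and removal of finished futures) can be exercised outside the ResourceManager with a small sketch. This is an illustration under stated assumptions, not the actual patch: `TrackerSketch`, `idleWaitMs`, `drainOnce`, and `demoRemaining` are hypothetical names, the map is keyed by `String` instead of `DelegationTokenRenewerEvent`, and retry scheduling is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TrackerSketch {
  // Hypothetical stand-in for the tracker's `futures` map.
  final Map<String, Future<?>> futures = new ConcurrentHashMap<>();

  // Clamp the idle wait to [500, 10000] ms, mirroring the proposal.
  static long idleWaitMs(long timeoutMs) {
    return Math.min(10000, Math.max(500, timeoutMs));
  }

  // One pass over the map: wait up to timeoutMs per task, cancel on timeout,
  // then drop every future that is done (cancelled futures also report done).
  // Returns the number of entries removed.
  int drainOnce(long timeoutMs) {
    int removed = 0;
    for (Map.Entry<String, Future<?>> entry : futures.entrySet()) {
      Future<?> future = entry.getValue();
      try {
        future.get(timeoutMs, TimeUnit.MILLISECONDS);
      } catch (TimeoutException e) {
        future.cancel(true); // retry scheduling is omitted in this sketch
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return removed;
      } catch (Exception e) {
        // The task failed or was cancelled; it is removed below.
      }
      if (future.isDone() && futures.remove(entry.getKey(), future)) {
        removed++;
      }
    }
    return removed;
  }

  // Submits one fast and one slow task, drains once with a 200 ms timeout,
  // and returns how many entries are left in the map afterwards.
  static int demoRemaining() {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    TrackerSketch tracker = new TrackerSketch();
    tracker.futures.put("fast", pool.submit(() -> { }));
    tracker.futures.put("slow", pool.submit(() -> {
      try {
        Thread.sleep(5000);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }));
    tracker.drainOnce(200);
    pool.shutdownNow();
    return tracker.futures.size();
  }

  public static void main(String[] args) {
    System.out.println("idle wait for timeout 100 ms: " + idleWaitMs(100));
    System.out.println("entries left after one pass: " + demoRemaining());
  }
}
```

      Using `futures.remove(key, future)` removes an entry only if it is still mapped to the same future, so a task that was re-registered under the same key between the check and the removal is not dropped by mistake.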

       

            People

              Assignee: Unassigned
              Reporter: Lennon Chin (LennonChin)
              Votes: 0
              Watchers: 5

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 40m