Hadoop YARN / YARN-9640

Slow event processing could cause too many attempt unregister events

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0, 3.2.1
    • Component/s: None
    • Labels:
    • Target Version/s:
    • Hadoop Flags: Reviewed

      Description

      During verification on one of our test clusters, we found that the number of attempt unregister events had grown to 300k+.

      1. All containers of the AM complete.
      2. AMRMClientImpl sends finishApplicationMaster.
      3. AMRMClient checks the finish status every 100 ms using the finishApplicationMaster request.
      4. AMRMClientImpl#unregisterApplicationMaster
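              // Poll the RM until it reports the application as unregistered.
              while (true) {
                // Each iteration issues a new finishApplicationMaster call to the RM.
                FinishApplicationMasterResponse response =
                    rmClient.finishApplicationMaster(request);
                if (response.getIsUnregistered()) {
                  break;
                }
                LOG.info("Waiting for application to be successfully unregistered.");
                // Retry after 100 ms.
                Thread.sleep(100);
              }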
        
      5. The ApplicationMasterService finishApplicationMaster interface sends an unregister event on every one of these status checks, so the events pile up when event processing is slow.

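      We should send the unregister event only once and cache the fact that it has been sent. Subsequent finishApplicationMaster calls for the same attempt should be ignored and answered with a "not unregistered" response to the AM, so that the event queue is not overloaded.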
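
      The proposal above can be illustrated with a minimal, self-contained Java sketch. It is not the attached patch and does not use the real YARN classes: the class name UnregisterOnceSketch, the string-based event queue, and every method except the name finishApplicationMaster are hypothetical stand-ins chosen for illustration. It assumes that a thread-safe set of attempt IDs is sufficient to remember which attempts already have an unregister event queued.

```java
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Illustrative only: a hypothetical stand-in for the RM-side handling. */
public class UnregisterOnceSketch {
  private static final String PREFIX = "UNREGISTER:";

  // Hypothetical stand-in for the RM event queue.
  private final Queue<String> eventQueue = new ConcurrentLinkedQueue<>();
  // Attempts for which an unregister event has already been queued.
  private final Set<String> unregisterRequested = ConcurrentHashMap.newKeySet();
  // Attempts whose unregister event has been processed.
  private final Set<String> unregistered = ConcurrentHashMap.newKeySet();

  /** Returns true once the attempt is unregistered, false while pending. */
  public boolean finishApplicationMaster(String attemptId) {
    if (unregistered.contains(attemptId)) {
      return true;
    }
    // Set.add returns true only for the first caller, so only the first
    // request for an attempt queues an event; later polls are ignored.
    if (unregisterRequested.add(attemptId)) {
      eventQueue.add(PREFIX + attemptId);
    }
    return false;
  }

  /** Simulates the (possibly slow) dispatcher handling one queued event. */
  public void processOneEvent() {
    String event = eventQueue.poll();
    if (event != null) {
      unregistered.add(event.substring(PREFIX.length()));
    }
  }

  public int pendingEvents() {
    return eventQueue.size();
  }

  public static void main(String[] args) {
    UnregisterOnceSketch rm = new UnregisterOnceSketch();
    // The AM polls many times while event processing is stalled.
    for (int i = 0; i < 1000; i++) {
      rm.finishApplicationMaster("attempt_1");
    }
    System.out.println(rm.pendingEvents()); // prints 1
    rm.processOneEvent();
    System.out.println(rm.finishApplicationMaster("attempt_1")); // prints true
  }
}
```

      In this sketch the AM-side polling loop is unchanged. Only the first call per attempt reaches the queue, so the queue grows with the number of attempts rather than the number of polls. A real implementation would also need to evict entries for finished attempts so the cache does not grow without bound.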

        Attachments

        1. YARN-9640.001.patch
          6 kB
          Bibin Chundatt
        2. YARN-9640.002.patch
          6 kB
          Bibin Chundatt
        3. YARN-9640.003.patch
          5 kB
          Bibin Chundatt
        4. YARN-9640-branch-3.2.001.patch
          6 kB
          Bibin Chundatt

            People

            • Assignee: Bibin Chundatt
            • Reporter: Bibin Chundatt
            • Votes: 0
            • Watchers: 4