  1. Hadoop YARN
  2. YARN-7054 Yarn Service Phase 2
  3. YARN-6168

Restarted RM may not inform AM about all existing containers


    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1.0
    • Component/s: None
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      There appears to be a race condition when an RM is restarted. I had a situation where the RMs and AM were down, but NMs and app containers were still running. When I restarted the RM, the AM restarted, registered with the RM, and received its list of existing containers before the NMs had reported all of their containers to the RM. The AM was only told about some of the app's existing containers.
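
      The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyResourceManager`, `ToyAppMaster`, and the container IDs are hypothetical stand-ins, not Hadoop classes, and the "deliver late-reported containers on a later heartbeat" behaviour is one possible mitigation, not necessarily what the attached patches implement.

      ```java
      import java.util.ArrayList;
      import java.util.LinkedHashSet;
      import java.util.List;
      import java.util.Set;

      /** Toy model of the restart race; none of these types are Hadoop classes. */
      public class RecoveryRaceSketch {

          /** Hypothetical stand-in for a restarted RM rebuilding its state from NM reports. */
          static class ToyResourceManager {
              private final Set<String> known = new LinkedHashSet<>();
              private final Set<String> delivered = new LinkedHashSet<>();

              /** An NM re-registers and reports a container that is still running. */
              void nodeReports(String containerId) {
                  known.add(containerId);
              }

              /** Returns containers known so far that the AM has not yet been told about. */
              List<String> drainUndelivered() {
                  List<String> out = new ArrayList<>();
                  for (String id : known) {
                      if (delivered.add(id)) {
                          out.add(id);
                      }
                  }
                  return out;
              }
          }

          /** Hypothetical AM that accepts recovered containers at registration and on heartbeats. */
          static class ToyAppMaster {
              final Set<String> containers = new LinkedHashSet<>();

              void register(ToyResourceManager rm) {
                  containers.addAll(rm.drainUndelivered());
              }

              void heartbeat(ToyResourceManager rm) {
                  containers.addAll(rm.drainUndelivered());
              }
          }

          /** Returns {containers seen at registration, containers seen after one heartbeat}. */
          static int[] simulate() {
              ToyResourceManager rm = new ToyResourceManager();
              ToyAppMaster am = new ToyAppMaster();

              rm.nodeReports("container_01");   // only one NM has reported so far
              am.register(rm);                  // AM registers too early: partial list
              int atRegistration = am.containers.size();

              rm.nodeReports("container_02");   // remaining NMs report afterwards
              rm.nodeReports("container_03");
              am.heartbeat(rm);                 // late containers arrive on a later heartbeat

              return new int[] {atRegistration, am.containers.size()};
          }

          public static void main(String[] args) {
              int[] counts = simulate();
              System.out.println("at registration: " + counts[0]
                      + ", after heartbeat: " + counts[1]);
          }
      }
      ```

      In this model the AM sees one of three containers at registration. If the registration response were its only source of recovered containers, the other two would never reach it, which is the symptom reported here.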

        Attachments

        1. YARN-6168.004.patch
          27 kB
          Chandni Singh
        2. YARN-6168.003.patch
          27 kB
          Chandni Singh
        3. YARN-6168.002.patch
          27 kB
          Chandni Singh
        4. YARN-6168.001.patch
          25 kB
          Chandni Singh

              People

              • Assignee:
                csingh Chandni Singh
                Reporter:
                billie Billie Rinaldi
              • Votes:
                0
                Watchers:
                11

                Dates

                • Created:
                  Updated:
                  Resolved: