Hadoop YARN / YARN-7054 Yarn Service Phase 2 / YARN-6168

Restarted RM may not inform AM about all existing containers

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      There appears to be a race condition when an RM is restarted. I had a situation where the RMs and AM were down, but NMs and app containers were still running. When I restarted the RM, the AM restarted, registered with the RM, and received its list of existing containers before the NMs had reported all of their containers to the RM. The AM was only told about some of the app's existing containers.
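
      The ordering problem described above can be sketched with a toy model. This is only an illustration, not YARN code: `ToyResourceManager`, `nodeManagerReports`, `registerApplicationMaster`, and `heartbeat` are hypothetical names, and the idea of handing late-reported containers to the AM on a later heartbeat is an assumption made for the sketch, not a description of what the attached patches do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the restart race; none of these types are real YARN classes. */
final class RestartRaceSketch {

    /** Hypothetical stand-in for a restarted RM that relearns containers from NM reports. */
    static final class ToyResourceManager {
        private final Set<String> known = new LinkedHashSet<>();    // learned from NMs so far
        private final Set<String> toldToAm = new LinkedHashSet<>(); // already passed to the AM

        /** An NM re-registers and reports the containers it is still running. */
        void nodeManagerReports(List<String> containerIds) {
            known.addAll(containerIds);
        }

        /** AM registration: returns only the containers known at this instant. */
        List<String> registerApplicationMaster() {
            toldToAm.addAll(known);
            return new ArrayList<>(known);
        }

        /** Assumed mitigation: a later heartbeat returns containers learned since then. */
        List<String> heartbeat() {
            List<String> late = new ArrayList<>(known);
            late.removeAll(toldToAm);
            toldToAm.addAll(late);
            return late;
        }
    }

    public static void main(String[] args) {
        ToyResourceManager rm = new ToyResourceManager();

        // The first NM reports before the AM registers.
        rm.nodeManagerReports(Arrays.asList("container_1", "container_2"));

        // The AM registers and sees an incomplete list: container_3 is missing.
        Set<String> amView = new LinkedHashSet<>(rm.registerApplicationMaster());
        System.out.println("After registration: " + amView);

        // The second NM reports only after registration has completed.
        rm.nodeManagerReports(Arrays.asList("container_3"));

        // Without a follow-up channel the AM would never learn about container_3.
        amView.addAll(rm.heartbeat());
        System.out.println("After heartbeat: " + amView);
    }
}
```

      In this model the registration response is a one-time snapshot, so any container reported after it is invisible to the AM unless some later message carries it.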

      Attachments

        1. YARN-6168.004.patch
          27 kB
          Chandni Singh
        2. YARN-6168.003.patch
          27 kB
          Chandni Singh
        3. YARN-6168.002.patch
          27 kB
          Chandni Singh
        4. YARN-6168.001.patch
          25 kB
          Chandni Singh

            People

              Assignee: Chandni Singh (csingh)
              Reporter: Billie Rinaldi (billie)
