  1. Hadoop YARN
  2. YARN-7054 Yarn Service Phase 2
  3. YARN-6168

Restarted RM may not inform AM about all existing containers


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      There appears to be a race condition when an RM is restarted. I had a situation where the RMs and AM were down, but NMs and app containers were still running. When I restarted the RM, the AM restarted, registered with the RM, and received its list of existing containers before the NMs had reported all of their containers to the RM. The AM was only told about some of the app's existing containers.
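      The ordering problem described above can be illustrated with a minimal toy model. This is only a sketch, not YARN code: `ToyResourceManager`, `node_report`, `register_am`, and the node and container names are all hypothetical stand-ins, and the model assumes the RM answers an AM registration with whatever containers the NMs have reported up to that moment.

      ```python
      from dataclasses import dataclass, field


      @dataclass
      class ToyResourceManager:
          """Hypothetical stand-in for a freshly restarted RM (not the YARN API)."""

          # node name -> set of container ids that node has reported since the restart
          known_containers: dict = field(default_factory=dict)

          def node_report(self, node, containers):
              """An NM re-registers and reports the containers it is still running."""
              self.known_containers[node] = set(containers)

          def register_am(self):
              """The AM registers and gets a snapshot of the containers known so far."""
              return set().union(*self.known_containers.values())


      def simulate_race():
          # Containers that kept running on the NMs while the RM and AM were down.
          running = {"nm1": ["c1", "c2"], "nm2": ["c3"]}
          all_containers = {c for cs in running.values() for c in cs}

          rm = ToyResourceManager()
          rm.node_report("nm1", running["nm1"])  # nm1 reports first
          seen_by_am = rm.register_am()          # AM registers before nm2 reports
          rm.node_report("nm2", running["nm2"])  # nm2 reports too late for the AM

          missing = all_containers - seen_by_am
          return seen_by_am, missing


      if __name__ == "__main__":
          seen, missing = simulate_race()
          print("AM was told about:", sorted(seen))
          print("AM never heard about:", sorted(missing))
      ```

      In this model the AM's view is fixed at registration time, so any container reported afterwards (here `c3`) is never delivered to it, which matches the symptom in the description.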

      Attachments

        1. YARN-6168.004.patch
          27 kB
          Chandni Singh
        2. YARN-6168.003.patch
          27 kB
          Chandni Singh
        3. YARN-6168.002.patch
          27 kB
          Chandni Singh
        4. YARN-6168.001.patch
          25 kB
          Chandni Singh


          People

            Assignee: Chandni Singh (csingh)
            Reporter: Billie Rinaldi (billie)
            Votes: 0
            Watchers: 11
