  1. Hadoop YARN
  2. YARN-2915 Enable YARN RM scale out via federation using multiple RM's
  3. YARN-6127

Add support for work preserving NM restart when AMRMProxy is enabled


Details

    • Incompatible change
      This breaks rolling upgrades because it changes the major version of the NM state store schema. As a result, a new NM that comes up on an old state store crashes.

      The state store versions for this change have been updated in YARN-6798.
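      To illustrate why a major version bump breaks rolling upgrades, here is a minimal sketch of a schema version check of the kind described above. The class name, method names, and version numbers are hypothetical and are not taken from the patch; the only assumption carried over from the note is that a major version mismatch is treated as fatal at startup.

```java
// Hypothetical sketch: not the actual NM state store code.
class SchemaVersion {
    final int major;
    final int minor;

    SchemaVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    // Assumption: versions are compatible only when the major numbers match;
    // minor differences are tolerated.
    boolean isCompatibleTo(SchemaVersion other) {
        return this.major == other.major;
    }

    // Mimics a startup check: fails fast when the stored schema is incompatible.
    static void checkVersion(SchemaVersion current, SchemaVersion stored) {
        if (!current.isCompatibleTo(stored)) {
            throw new IllegalStateException(
                "Incompatible state store schema: expected major " + current.major
                    + ", found " + stored.major);
        }
    }

    public static void main(String[] args) {
        // A minor bump loads an older store without error.
        checkVersion(new SchemaVersion(1, 1), new SchemaVersion(1, 0));
        // A major bump makes the new NM refuse the old store.
        try {
            checkVersion(new SchemaVersion(2, 0), new SchemaVersion(1, 0));
        } catch (IllegalStateException e) {
            System.out.println("startup aborted: " + e.getMessage());
        }
    }
}
```

      Under this assumption, a minor version bump would let a new NM read the existing store, whereas a major bump cannot be rolled out node by node without clearing or migrating the store first.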

    Description

      YARN-1336 added the ability to restart the NM without losing any running containers. In a Federated YARN environment, the AMRMProxy holds additional state that lets an application span multiple sub-clusters, so the AMRMProxy must be enhanced to support work-preserving restart as well.
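
      The idea can be sketched as follows: the proxy writes its per-application state to a store as it changes, and rebuilds its in-memory view from that store after a restart. This is a simplified illustration only. `ProxyStateStore`, `ProxySketch`, and the keys used here are hypothetical names, the store is an in-memory map standing in for a persistent one, and none of this reflects the actual interfaces in the patch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a persistent store; a real one would write to disk.
class ProxyStateStore {
    private final Map<String, Map<String, String>> entries = new HashMap<>();

    void store(String attemptId, String key, String value) {
        entries.computeIfAbsent(attemptId, k -> new HashMap<>()).put(key, value);
    }

    void remove(String attemptId) {
        entries.remove(attemptId);
    }

    // Returns a copy so callers cannot mutate the stored state.
    Map<String, Map<String, String>> load() {
        Map<String, Map<String, String>> copy = new HashMap<>();
        for (Map.Entry<String, Map<String, String>> e : entries.entrySet()) {
            copy.put(e.getKey(), new HashMap<>(e.getValue()));
        }
        return copy;
    }
}

// Hypothetical proxy that mirrors every state change into the store.
class ProxySketch {
    private final ProxyStateStore store;
    private final Map<String, Map<String, String>> running = new HashMap<>();

    ProxySketch(ProxyStateStore store) {
        this.store = store;
    }

    void registerApp(String attemptId, String subCluster) {
        running.computeIfAbsent(attemptId, k -> new HashMap<>())
               .put("subCluster", subCluster);
        store.store(attemptId, "subCluster", subCluster);
    }

    void finishApp(String attemptId) {
        running.remove(attemptId);
        store.remove(attemptId);
    }

    // Called once after a restart to rebuild in-memory state from the store.
    void recover() {
        running.putAll(store.load());
    }

    Set<String> runningApps() {
        return new HashSet<>(running.keySet());
    }
}
```

      In this sketch, a fresh `ProxySketch` constructed over the same store and then asked to `recover()` sees exactly the applications that were registered and not yet finished before the restart.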

      Attachments

        1. YARN-6127-branch-2.v1.patch
          72 kB
          Botong Huang
        2. YARN-6127.v4.patch
          72 kB
          Botong Huang
        3. YARN-6127.v3.patch
          72 kB
          Botong Huang
        4. YARN-6127.v2.patch
          75 kB
          Botong Huang
        5. YARN-6127.v1.patch
          68 kB
          Botong Huang


          People

            botong Botong Huang
            subru Subramaniam Krishnan
            Votes: 0
            Watchers: 7
