  1. Hadoop YARN
  2. YARN-556 [Umbrella] RM Restart phase 2 - Work preserving restart
  3. YARN-2674

Distributed shell AM may re-launch containers if RM work preserving restart happens

Details

    Description

      Currently, if an RM work-preserving restart happens while distributed shell is running, the distributed shell AM may re-launch all of its containers, including new, running, and completed ones. We must make sure it does not re-launch running or completed containers.
      We need to remove allocated containers from AMRMClientImpl#remoteRequestsTable once the AM receives them from the RM.
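
The bookkeeping described above can be illustrated with a minimal, self-contained sketch. It does not use the Hadoop API: `PendingAskTable` and its methods are hypothetical names, and keying the table by priority alone is a simplifying assumption (the real `remoteRequestsTable` tracks more than priority). The idea is that every allocation decrements the outstanding ask, so whatever the client re-sends after an RM resync covers only containers it has not yet received.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the AM-side table of outstanding container asks. */
public class PendingAskTable {
    // priority -> number of containers still wanted at that priority
    private final Map<Integer, Integer> pending = new HashMap<>();

    /** Record that the AM wants `count` more containers at `priority`. */
    public void addRequest(int priority, int count) {
        pending.merge(priority, count, Integer::sum);
    }

    /**
     * Called when the RM hands back one container at `priority`.
     * Decrements the outstanding ask and drops the entry once it reaches zero,
     * so the satisfied ask is never sent again.
     */
    public void onContainerAllocated(int priority) {
        pending.computeIfPresent(priority, (p, n) -> n > 1 ? n - 1 : null);
    }

    /** Asks the client would re-send to a restarted RM (a defensive copy). */
    public Map<Integer, Integer> asksToResendAfterResync() {
        return new HashMap<>(pending);
    }

    public static void main(String[] args) {
        PendingAskTable table = new PendingAskTable();
        table.addRequest(0, 3);          // AM asks for three containers
        table.onContainerAllocated(0);   // two of them arrive before the restart
        table.onContainerAllocated(0);
        // Only the one unsatisfied ask is left to re-send.
        System.out.println(table.asksToResendAfterResync()); // prints {0=1}
    }
}
```

If `onContainerAllocated` were never called, the resync in this sketch would re-send the full ask of three, which corresponds to the re-launch behaviour reported in this issue.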

      Attachments

        1. YARN-2674.1.patch
          2 kB
          Chun Chen
        2. YARN-2674.2.patch
          3 kB
          Chun Chen
        3. YARN-2674.3.patch
          18 kB
          Chun Chen
        4. YARN-2674.4.patch
          23 kB
          Chun Chen
        5. YARN-2674.5.patch
          23 kB
          Chun Chen
        6. YARN-2674.6.patch
          6 kB
          Shane Kumpf
        7. YARN-2674.7.patch
          7 kB
          Shane Kumpf

        Activity

          People

            shanekumpf@gmail.com Shane Kumpf
            chenchun Chun Chen
            Votes: 0
            Watchers: 11