Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: resourcemanager
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Currently, work-preserving RM restart recovers unmanaged AMs, but it has two shortcomings: all running containers are killed, and completed unmanaged AMs are also recovered because we do not record the final state of unmanaged AMs in the RM StateStore. This JIRA proposes to address both shortcomings so that work-preserving recovery of unmanaged AMs works exactly as it does for managed AMs.
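
      The intended behaviour can be illustrated with a small toy model. This is only a sketch under stated assumptions, not YARN code: `ToyStateStore`, `StoredAttempt`, `FinalState`, and `recover` are hypothetical names invented for illustration. It shows the two properties the description asks for: an attempt whose final state was recorded is not resumed after a restart, and a running attempt is resumed with its containers intact.

      ```python
      from dataclasses import dataclass, field
      from enum import Enum
      from typing import Dict, List, Optional


      class FinalState(Enum):
          """Hypothetical terminal states for an application attempt."""
          FINISHED = "FINISHED"
          FAILED = "FAILED"
          KILLED = "KILLED"


      @dataclass
      class StoredAttempt:
          """Hypothetical record persisted for one application attempt."""
          app_id: str
          unmanaged: bool
          final_state: Optional[FinalState] = None  # None means still running
          containers: List[str] = field(default_factory=list)


      class ToyStateStore:
          """In-memory stand-in for a persistent state store (assumption)."""

          def __init__(self) -> None:
              self._attempts: Dict[str, StoredAttempt] = {}

          def store_attempt(self, attempt: StoredAttempt) -> None:
              self._attempts[attempt.app_id] = attempt

          def record_final_state(self, app_id: str, state: FinalState) -> None:
              # Recorded for managed and unmanaged attempts alike.
              self._attempts[app_id].final_state = state

          def load_all(self) -> List[StoredAttempt]:
              return list(self._attempts.values())


      def recover(store: ToyStateStore) -> Dict[str, List[str]]:
          """Return app_id -> containers for attempts to resume after a restart."""
          resumed: Dict[str, List[str]] = {}
          for attempt in store.load_all():
              if attempt.final_state is not None:
                  continue  # completed attempt: do not bring it back as running
              # Running attempt: keep its containers instead of killing them.
              resumed[attempt.app_id] = list(attempt.containers)
          return resumed
      ```

      In this model, the fix amounts to calling `record_final_state` for unmanaged attempts as well, so that `recover` treats them the same way as managed ones.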

      Attachments

        1. yarn-1815-1.patch
          3 kB
          Karthik Kambatla
        2. yarn-1815-2.patch
          3 kB
          Karthik Kambatla
        3. Unmanaged AM recovery.png
          149 kB
          Karthik Kambatla
        4. yarn-1815-2.patch
          4 kB
          Karthik Kambatla
        5. YARN-1815-v3.patch
          9 kB
          Subramaniam Krishnan
        6. YARN-1815-v4.patch
          10 kB
          Subramaniam Krishnan
        7. YARN-1815-v5.patch
          11 kB
          Subramaniam Krishnan
        8. YARN-1815-v6.patch
          13 kB
          Subramaniam Krishnan

            People

              Assignee: Subramaniam Krishnan (subru)
              Reporter: Karthik Kambatla (kasha)
              Votes: 0
              Watchers: 16