Hadoop YARN / YARN-261

Ability to fail AM attempts

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.3-alpha
    • Fix Version/s: 2.8.0, 3.0.0-alpha1
    • Component/s: api
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      It would be nice if clients could ask for an AM attempt to be killed. This is analogous to the task attempt kill support provided by MapReduce.

      This feature would be useful in a scenario where AM retries are enabled, the AM supports recovery, and a particular AM attempt is stuck. Currently, if this occurs, the user's only recourse is to kill the entire application. That forces them to submit a new application and can break downstream dependent jobs if the application is part of a bigger workflow. Killing only the attempt would let the RM start a new attempt without killing the entire application, and if the AM supports recovery it could save a lot of work. It could also be useful in workflow scenarios where the failure of the entire application kills the workflow: the ability to kill an attempt can keep the workflow going if the subsequent attempt succeeds.
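
The difference between killing the application and failing a single attempt can be sketched with a toy model. This is only an illustration of the behavior requested above, not the YARN API: `ToyApplication`, `failCurrentAttempt`, `killApplication`, and `maxAttempts` are hypothetical names, and the sketch assumes that a client-requested failure counts against the AM retry limit like any other attempt failure.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every name here is hypothetical and is not part of YARN.
class ToyApplication {
    enum State { RUNNING, FAILED, KILLED }

    private final int maxAttempts;  // stands in for the AM retry limit
    private final List<String> attemptLog = new ArrayList<>();
    private int currentAttempt = 1;
    private State state = State.RUNNING;

    ToyApplication(int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
        attemptLog.add("attempt 1 started");
    }

    // The existing recourse: the whole application ends and must be submitted again.
    void killApplication() {
        if (state != State.RUNNING) {
            return;
        }
        state = State.KILLED;
        attemptLog.add("application killed");
    }

    // The requested feature: fail only the current attempt.
    // Assumption: the failure counts against the retry limit.
    void failCurrentAttempt() {
        if (state != State.RUNNING) {
            return;
        }
        attemptLog.add("attempt " + currentAttempt + " failed by client");
        if (currentAttempt < maxAttempts) {
            currentAttempt++;
            attemptLog.add("attempt " + currentAttempt + " started");
        } else {
            state = State.FAILED;
            attemptLog.add("application failed: no attempts left");
        }
    }

    State getState() { return state; }
    int getCurrentAttempt() { return currentAttempt; }
    List<String> getAttemptLog() { return attemptLog; }
}
```

In this model, an application created with `new ToyApplication(2)` stays in the running state after one call to `failCurrentAttempt()` and moves on to its second attempt, whereas `killApplication()` ends it immediately. A second `failCurrentAttempt()` exhausts the limit and fails the application.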

      Attachments

        1. 0001-YARN-261.patch
          60 kB
          Rohith Sharma K S
        2. 0002-YARN-261.patch
          68 kB
          Rohith Sharma K S
        3. YARN-261.patch
          64 kB
          Andrey Klochkov
        4. YARN-261--n2.patch
          66 kB
          Andrey Klochkov
        5. YARN-261--n3.patch
          66 kB
          Andrey Klochkov
        6. YARN-261--n4.patch
          71 kB
          Andrey Klochkov
        7. YARN-261--n5.patch
          60 kB
          Andrey Klochkov
        8. YARN-261--n6.patch
          83 kB
          Andrey Klochkov
        9. YARN-261--n7.patch
          76 kB
          Andrey Klochkov

            People

              Assignee: Rohith Sharma K S (rohithsharma)
              Reporter: Jason Darrell Lowe (jlowe)
              Votes: 0
              Watchers: 17
