  1. Hadoop YARN
  2. YARN-614

Separate AM failures from hardware failure or YARN error and do not count them to AM retry count

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.5.0
    • Component/s: resourcemanager
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Application attempts can fail because of many kinds of user error, and such attempts should not be retried unnecessarily. The only reasons YARN should retry an attempt are a hardware failure or an error in YARN itself. NodeManager (NM) failures, lost NMs, and NM disk errors are the hardware errors that come to mind.
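
The idea in the title, keeping hardware and YARN failures out of the AM retry count, could be sketched roughly as follows. This is only an illustration under stated assumptions: `AmRetryPolicy`, `FailureCause`, `countedFailures`, and `shouldRetry` are hypothetical names invented for this example and are not taken from the attached patches or from YARN's API.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: decide which failed AM attempts count toward the retry limit. */
final class AmRetryPolicy {

    /** Illustrative failure causes; these names are assumptions, not YARN types. */
    enum FailureCause {
        NODE_MANAGER_FAILED(true),
        NODE_MANAGER_LOST(true),
        NODE_DISK_ERROR(true),
        YARN_INTERNAL_ERROR(true),
        USER_ERROR(false);

        private final boolean platformFault;

        FailureCause(boolean platformFault) {
            this.platformFault = platformFault;
        }

        /** True when the failure is attributed to hardware or to YARN itself. */
        boolean isPlatformFault() {
            return platformFault;
        }
    }

    /** Counts only the failures that are charged against the application. */
    static int countedFailures(List<FailureCause> failures) {
        int counted = 0;
        for (FailureCause cause : failures) {
            if (!cause.isPlatformFault()) {
                counted++;
            }
        }
        return counted;
    }

    /** Another attempt is allowed while the charged failures stay below the limit. */
    static boolean shouldRetry(List<FailureCause> failures, int maxAttempts) {
        return countedFailures(failures) < maxAttempts;
    }

    public static void main(String[] args) {
        List<FailureCause> history = Arrays.asList(
                FailureCause.NODE_DISK_ERROR,
                FailureCause.NODE_MANAGER_LOST,
                FailureCause.USER_ERROR);
        System.out.println(countedFailures(history)); // prints 1
        System.out.println(shouldRetry(history, 2));  // prints true
    }
}
```

In this sketch, two of the three failed attempts are attributed to the platform, so only the user error is charged against the limit of two attempts and a further attempt is still allowed.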

      Attachments

        1. YARN-614-6.patch
          25 kB
          Chris Riccomini
        2. YARN-614-5.patch
          25 kB
          Chris Riccomini
        3. YARN-614-4.patch
          25 kB
          Chris Riccomini
        4. YARN-614-3.patch
          5 kB
          Chris Riccomini
        5. YARN-614-2.patch
          5 kB
          Chris Riccomini
        6. YARN-614-1.patch
          14 kB
          Chris Riccomini
        7. YARN-614-0.patch
          9 kB
          Chris Riccomini
        8. YARN-614.9.patch
          15 kB
          Xuan Gong
        9. YARN-614.8.patch
          16 kB
          Xuan Gong
        10. YARN-614.7.patch
          22 kB
          Xuan Gong
        11. YARN-614.13.patch
          19 kB
          Jian He
        12. YARN-614.12.patch
          19 kB
          Xuan Gong
        13. YARN-614.11.patch
          19 kB
          Xuan Gong
        14. YARN-614.10.patch
          17 kB
          Xuan Gong

          People

            Assignee: Xuan Gong (xgong)
            Reporter: Bikas Saha (bikassaha)
            Votes: 0
            Watchers: 23
