  1. Hadoop Map/Reduce
  2. MAPREDUCE-6776

yarn.app.mapreduce.client.job.max-retries should have a more useful default

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.9.0, 3.0.0-alpha2
    • Component/s: client
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      The default value of yarn.app.mapreduce.client.job.max-retries has been changed from 0 to 3. This will help protect clients from failures that are transient. True failures may take slightly longer now due to the retries.

      Description

      The default is 0, so any communication failure results in a client failure. Oozie handles that poorly: if the RM is failing over and Oozie gets a communication failure, it assumes the target job has failed. I propose raising the default to something modest like 3 or 5. The default retry interval is 2s.
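
      For reference, a minimal sketch of how a client could set this property explicitly instead of relying on the default. It assumes the override goes in the client-side mapred-site.xml; the value 3 matches the new default from the release note, and setting it to 0 would restore the previous fail-fast behavior. With the 2s retry interval mentioned above, three retries would add roughly six seconds before a true failure is reported.

      ```xml
      <!-- Assumed location: client-side mapred-site.xml -->
      <property>
        <name>yarn.app.mapreduce.client.job.max-retries</name>
        <!-- 3 is the new default; 0 disables retries (the old default) -->
        <value>3</value>
      </property>
      ```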

        Attachments

        1. MAPREDUCE-6776.003.patch
          6 kB
          Miklos Szegedi
        2. MAPREDUCE-6776.002.patch
          5 kB
          Miklos Szegedi
        3. MAPREDUCE-6776.001.patch
          4 kB
          Miklos Szegedi

            People

            • Assignee: Miklos Szegedi (miklos.szegedi@cloudera.com)
            • Reporter: Daniel Templeton (templedf)
