  1. Hadoop Map/Reduce
  2. MAPREDUCE-6776

yarn.app.mapreduce.client.job.max-retries should have a more useful default

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.9.0, 3.0.0-alpha2
    • Component/s: client
    • Labels: None
    • Hadoop Flags: Incompatible change, Reviewed
    • Release Note: The default value of yarn.app.mapreduce.client.job.max-retries has been changed from 0 to 3. This will help protect clients from failures that are transient. True failures may take slightly longer now due to the retries.

    Description

      The current default is 0, so any communication failure results in a client failure. Oozie handles that poorly: if the RM is failing over and Oozie gets a communication failure, it assumes the target job has failed. I propose raising the default to something modest like 3 or 5. The default retry interval is 2s.
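
To make the effect of the proposed default concrete, here is a minimal Java sketch of a bounded retry loop. It is an illustration only, not the Hadoop client code: the class `RetrySketch`, the methods `fetchWithRetries` and `worstCaseWaitMillis`, and the simulated lookup are all hypothetical names, and the assumption that a failed lookup shows up as a `null` result is made purely for the example.

```java
import java.util.function.Supplier;

/** Hypothetical illustration of max-retries semantics; not Hadoop code. */
final class RetrySketch {

    /**
     * Calls lookup up to maxRetries + 1 times, sleeping intervalMillis
     * between attempts. Returns the first non-null result, or null if
     * every attempt fails or the thread is interrupted while waiting.
     */
    static <T> T fetchWithRetries(Supplier<T> lookup, int maxRetries, long intervalMillis) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (attempt > 0) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return null;
                }
            }
            T result = lookup.get();
            if (result != null) {
                return result;
            }
        }
        return null;
    }

    /** Upper bound on the time spent sleeping before the caller gives up. */
    static long worstCaseWaitMillis(int maxRetries, long intervalMillis) {
        return (long) maxRetries * intervalMillis;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated lookup: the first two calls fail, the third succeeds.
        Supplier<String> flaky = () -> ++calls[0] < 3 ? null : "RUNNING";

        String noRetries = fetchWithRetries(flaky, 0, 10L);
        System.out.println(noRetries);    // prints "null": the single attempt fails

        calls[0] = 0;
        String withRetries = fetchWithRetries(flaky, 3, 10L);
        System.out.println(withRetries);  // prints "RUNNING": third attempt succeeds

        System.out.println(worstCaseWaitMillis(3, 2000L)); // prints "6000"
    }
}
```

Under these assumptions, a value of 0 means the first transient failure is reported to the caller immediately, while a value of 3 with a 2s interval absorbs short outages at the cost of roughly 6 seconds of added waiting before a genuine failure is reported.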

      Attachments

        1. MAPREDUCE-6776.001.patch
          4 kB
          Miklos Szegedi
        2. MAPREDUCE-6776.002.patch
          5 kB
          Miklos Szegedi
        3. MAPREDUCE-6776.003.patch
          6 kB
          Miklos Szegedi

          People

            Assignee: Miklos Szegedi (miklos.szegedi@cloudera.com)
            Reporter: Daniel Templeton (templedf)
            Votes: 0
            Watchers: 6
