  1. Hadoop YARN
  2. YARN-6153

keepContainer does not work when AM retry window is set

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.1
    • Fix Version/s: 2.9.0, 3.0.0-alpha4
    • Component/s: resourcemanager
    • Labels:
      None

      Description

      yarn.resourcemanager.am.max-attempts is set to 2 in my cluster.
      I submitted a YARN application (a Slider app) with keepContainers=true and attemptFailuresValidityInterval=300000.

      It worked properly when the AM failed the first time:
      all containers launched by the previous AM were resynced with the new AM (attempt 2) without being killed.

      After 10 minutes, I expected the AM failure count to have been reset by attemptFailuresValidityInterval (5 minutes).
      However, all containers were killed when the AM failed a second time. (A new AM, attempt 3, was launched properly.)
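
      The expectation above can be sketched as a small standalone model. This is not ResourceManager code: the class and method names (AmRetryWindowSketch, failuresInWindow, expectContainersKept) are hypothetical, and the rule "keep containers unless this failure exhausts the allowed attempts" is an assumption drawn from the description. It only illustrates why the second failure, 10 minutes after the first, would be expected to keep containers when the 5-minute window is honored, and why counting every failure regardless of the window would produce the reported behavior.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class AmRetryWindowSketch {

    // Hypothetical helper: counts earlier AM failures that still fall inside
    // the validity window. A non-positive interval means "no window", so
    // every earlier failure counts.
    static int failuresInWindow(List<Long> failureTimesMs, long nowMs, long validityIntervalMs) {
        if (validityIntervalMs <= 0) {
            return failureTimesMs.size();
        }
        int count = 0;
        for (long t : failureTimesMs) {
            if (nowMs - t <= validityIntervalMs) {
                count++;
            }
        }
        return count;
    }

    // Assumed rule (not taken from the YARN source): containers are kept
    // across an AM failure unless this failure uses up the last allowed attempt.
    static boolean expectContainersKept(List<Long> earlierFailuresMs, long nowMs,
                                        long validityIntervalMs, int maxAttempts) {
        int failures = failuresInWindow(earlierFailuresMs, nowMs, validityIntervalMs) + 1;
        return failures < maxAttempts;
    }

    public static void main(String[] args) {
        int maxAttempts = 2;        // yarn.resourcemanager.am.max-attempts
        long window = 300_000L;     // attemptFailuresValidityInterval (5 minutes)

        // First AM failure at t = 0: nothing has failed before.
        System.out.println("first failure, keep containers: "
            + expectContainersKept(Collections.<Long>emptyList(), 0L, window, maxAttempts));

        // Second AM failure 10 minutes later: the first failure is outside the window.
        List<Long> earlier = Arrays.asList(0L);
        System.out.println("second failure, window honored: "
            + expectContainersKept(earlier, 600_000L, window, maxAttempts));

        // Same failure if the window is ignored when deciding about containers.
        System.out.println("second failure, window ignored: "
            + expectContainersKept(earlier, 600_000L, -1L, maxAttempts));
    }
}
```

      Under these assumptions, only the "window ignored" case decides against keeping containers, which matches what was observed at the second failure.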

        Attachments

        1. YARN-6153-branch-2.8.patch
          24 kB
          kyungwan nam
        2. YARN-6153.006.patch
          25 kB
          kyungwan nam
        3. YARN-6153.005.patch
          23 kB
          kyungwan nam
        4. YARN-6153.004.patch
          21 kB
          kyungwan nam
        5. YARN-6153.003.patch
          10 kB
          kyungwan nam
        6. YARN-6153.002.patch
          10 kB
          kyungwan nam
        7. YARN-6153.001.patch
          2 kB
          kyungwan nam

            People

            • Assignee:
              kyungwan nam
              Reporter:
              kyungwan nam
            • Votes:
              0
              Watchers:
              8
