Hadoop YARN / YARN-6153

keepContainer does not work when AM retry window is set

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.1
    • Fix Version/s: 2.9.0, 3.0.0-alpha4
    • Component/s: resourcemanager
    • Labels: None

    Description

      yarn.resourcemanager.am.max-attempts is configured to 2 in my cluster.
      I submitted a YARN application (a Slider app) with keepContainers=true and attemptFailuresValidityInterval=300000.

      It worked properly the first time the AM failed:
      all containers launched by the previous AM were resynced with the new AM (attempt 2) without being killed.

      After 10 minutes, I expected the AM failure count to have been reset by attemptFailuresValidityInterval (5 minutes).
      However, all containers were killed when the AM failed a second time. (The new AM, attempt 3, was launched properly.)
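
      The expectation above can be illustrated with a small, self-contained sketch. It is not YARN's ResourceManager code: the class AmRetryWindowSketch and both of its methods are hypothetical names invented for illustration, and the rule "keep containers while fewer than max-attempts failures fall inside the validity interval" is an assumption taken from the behavior described in this report. The sketch treats a non-positive interval as "every failure counts".

```java
import java.util.List;

/**
 * Hypothetical model of the retry-window accounting the reporter expected.
 * This is an illustration only, not the ResourceManager implementation.
 */
final class AmRetryWindowSketch {
    private AmRetryWindowSketch() {}

    /**
     * Counts the failures that happened less than windowMs before nowMs.
     * Assumption: a non-positive window disables the window, so all failures count.
     */
    static int failuresInWindow(List<Long> failureTimesMs, long nowMs, long windowMs) {
        int count = 0;
        for (long failureTime : failureTimesMs) {
            if (windowMs <= 0 || nowMs - failureTime < windowMs) {
                count++;
            }
        }
        return count;
    }

    /**
     * Assumed rule: containers survive an AM failure when keepContainers is set
     * and the failures inside the window have not yet reached maxAttempts.
     */
    static boolean keepsContainers(boolean keepContainers, List<Long> failureTimesMs,
                                   long nowMs, long windowMs, int maxAttempts) {
        return keepContainers
                && failuresInWindow(failureTimesMs, nowMs, windowMs) < maxAttempts;
    }
}
```

      With the values from this report (max-attempts of 2, an interval of 300000 ms, and failures at 0 ms and 600000 ms), only the second failure falls inside the window when it is evaluated at 600000 ms, so under this model the containers would be kept. With the window disabled, both failures count and the containers would not be kept.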

      Attachments

        1. YARN-6153-branch-2.8.patch
          24 kB
          kyungwan nam
        2. YARN-6153.006.patch
          25 kB
          kyungwan nam
        3. YARN-6153.005.patch
          23 kB
          kyungwan nam
        4. YARN-6153.004.patch
          21 kB
          kyungwan nam
        5. YARN-6153.003.patch
          10 kB
          kyungwan nam
        6. YARN-6153.002.patch
          10 kB
          kyungwan nam
        7. YARN-6153.001.patch
          2 kB
          kyungwan nam

          People

            Assignee: kyungwan nam
            Reporter: kyungwan nam
            Votes: 0
            Watchers: 8
