  1. Hadoop YARN
  2. YARN-5773

RM recovery too slow due to LeafQueue#activateApplication()

Details

    • Reviewed

    Description

      1. Submit 10K applications to the default queue.
      2. All applications are in the ACCEPTED state.
      3. Restart the ResourceManager.

      For each recovered application, LeafQueue#activateApplications() is invoked, so the AM limit check is performed even before the NodeManagers have registered.

      The total number of iterations for N applications is about N(N+1)/2. For 10K applications that is roughly 50,000,000 iterations, which causes the RM to take more than 10 minutes to become active.
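
The N(N+1)/2 estimate can be checked with a small simulation. This is only a sketch under a simplified model, not the actual LeafQueue code: the class `RecoveryIterationCount` and the method `countIterations` are hypothetical names, and the model assumes that each recovered application triggers a full scan of the pending list while nothing can be activated.

```java
import java.util.ArrayList;
import java.util.List;

public class RecoveryIterationCount {
    // Simplified model (an assumption, not the real scheduler code):
    // each recovered application is appended to a pending list and the
    // whole list is then scanned. No NodeManager resources exist yet,
    // so nothing is activated and the list only grows.
    static long countIterations(int numApps) {
        List<Integer> pending = new ArrayList<>();
        long iterations = 0;
        for (int app = 0; app < numApps; app++) {
            pending.add(app);
            for (int ignored : pending) {
                iterations++; // stands in for one AM limit check
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        int n = 10_000;
        long simulated = countIterations(n);
        long closedForm = (long) n * (n + 1) / 2;
        // Both values are 50005000 for n = 10000.
        System.out.println(simulated + " iterations, closed form " + closedForm);
    }
}
```

Under this model the count is exactly N(N+1)/2, i.e. 50,005,000 for N = 10,000, which matches the rough figure quoted above.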

      Since NM resources have not yet been added during recovery, we should skip activateApplications().
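
One way the proposed skip could look is sketched below. Everything here is hypothetical and for illustration only: `ToyLeafQueue`, its fields, the 10% AM limit, and the 1024 MB AM size are assumptions, and the attached patches may take a different approach. The point is just that an early return while the cluster has no resources avoids the repeated scans.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Queue;

public class ToyLeafQueue {
    static final long AM_MEMORY_MB = 1024;  // assumed size of every AM

    final Queue<String> pending = new ArrayDeque<>();
    long clusterMemoryMb = 0;  // stays zero until a node registers
    long usedAmMb = 0;
    long limitChecks = 0;      // counts simulated AM limit checks
    int activeCount = 0;

    void submitApplication(String appId) {
        pending.add(appId);
        activateApplications();
    }

    void nodeRegistered(long memoryMb) {
        clusterMemoryMb += memoryMb;
        activateApplications();
    }

    private void activateApplications() {
        // Proposed guard: with no cluster resources nothing can be
        // activated, so skip the scan entirely.
        if (clusterMemoryMb == 0) {
            return;
        }
        long amLimitMb = clusterMemoryMb / 10; // assumed 10% AM limit
        Iterator<String> it = pending.iterator();
        while (it.hasNext()) {
            it.next();
            limitChecks++;
            if (usedAmMb + AM_MEMORY_MB > amLimitMb) {
                break; // AM limit reached, stop activating
            }
            usedAmMb += AM_MEMORY_MB;
            activeCount++;
            it.remove();
        }
    }

    public static void main(String[] args) {
        ToyLeafQueue queue = new ToyLeafQueue();
        for (int i = 0; i < 10_000; i++) {
            queue.submitApplication("app_" + i);
        }
        System.out.println("limit checks during recovery: " + queue.limitChecks);
        queue.nodeRegistered(102_400);
        System.out.println("active after first node: " + queue.activeCount);
    }
}
```

In this toy model, recovering 10,000 applications performs no limit checks at all, and activation happens once when the first node registers.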

      Attachments

        1. YARN-5773.0001.patch
          7 kB
          Bibin Chundatt
        2. YARN-5773.0002.patch
          7 kB
          Bibin Chundatt
        3. YARN-5773.0004.patch
          4 kB
          Bibin Chundatt
        4. YARN-5773.0005.patch
          4 kB
          Bibin Chundatt
        5. YARN-5773.0006.patch
          4 kB
          Bibin Chundatt
        6. YARN-5773.0007.patch
          7 kB
          Bibin Chundatt
        7. YARN-5773.0008.patch
          8 kB
          Varun Saxena
        8. YARN-5773.0009.patch
          9 kB
          Bibin Chundatt
        9. YARN-5773-branch-2.8.0001.patch
          9 kB
          Bibin Chundatt

            People

              bibinchundatt Bibin Chundatt
              bibinchundatt Bibin Chundatt
              Votes: 0
              Watchers: 14
