Hadoop YARN / YARN-5077

Fix FSLeafQueue#getFairShare() for queues with zero fairshare


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      1) When a queue's weight is set to 0.0, FSLeafQueue#getFairShare() returns <memory:0, vCores:0>.
      2) When a queue's weight is nonzero, FSLeafQueue#getFairShare() returns a nonzero value, for example <memory:16384, vCores:8>.
      In case 1), no container is ever allocated for an AM, because from the RM's viewpoint the queue never has any headroom for one.

      For example, consider two pools with the following weights:

      • root.dev 0.0
      • root.product 1.0

      root.dev is a best-effort pool and should get resources only when root.product is not running any jobs. In our tests, with no jobs running under root.product, jobs submitted to the root.dev queue stay stuck in the ACCEPTED state and never start.
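The failure mode described above can be illustrated with a minimal sketch. It is not the actual FSLeafQueue code. It assumes the AM limit is computed as a fraction of the queue's fair share (the Fair Scheduler's maxAMShare setting, taken here as 0.5), and the class and method names are hypothetical. Under that assumption, a fair share of zero gives an AM limit of zero, so no AM request can ever fit.

```java
// Illustrative sketch only: models memory in MB and ignores vCores.
// Class and method names are hypothetical, not taken from Hadoop.
public class ZeroFairShareSketch {

    // Assumed rule: AM limit = fair share * maxAMShare.
    static long maxAmResourceMb(long fairShareMb, double maxAmShare) {
        return (long) (fairShareMb * maxAmShare);
    }

    // An AM may start only if current AM usage plus the new request stays within the limit.
    static boolean canRunAm(long fairShareMb, double maxAmShare,
                            long amUsageMb, long amRequestMb) {
        return amUsageMb + amRequestMb <= maxAmResourceMb(fairShareMb, maxAmShare);
    }

    public static void main(String[] args) {
        double maxAmShare = 0.5;   // assumed value for this sketch
        long amRequestMb = 1024;   // hypothetical AM container size

        // Weight 0.0: fair share is <memory:0>, so the AM limit is 0 and the AM never starts.
        System.out.println("root.dev (fair share 0 MB): "
                + canRunAm(0L, maxAmShare, 0L, amRequestMb));

        // Nonzero weight: fair share is <memory:16384>, so the AM fits under the limit.
        System.out.println("root.product (fair share 16384 MB): "
                + canRunAm(16384L, maxAmShare, 0L, amRequestMb));
    }
}
```

In this model the zero-weight queue is rejected even when the rest of the cluster is idle, which matches the symptom of jobs staying in ACCEPTED.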

      Attachments

        1. YARN-5077.001.patch
          12 kB
          Yufei Gu
        2. YARN-5077.002.patch
          12 kB
          Yufei Gu
        3. YARN-5077.003.patch
          12 kB
          Yufei Gu
        4. YARN-5077.004.patch
          14 kB
          Yufei Gu
        5. YARN-5077.005.patch
          14 kB
          Yufei Gu
        6. YARN-5077.006.patch
          6 kB
          Yufei Gu
        7. YARN-5077.007.patch
          6 kB
          Yufei Gu
        8. YARN-5077.008.patch
          15 kB
          Yufei Gu
        9. YARN-5077.009.patch
          15 kB
          Yufei Gu
        10. YARN-5077.010.patch
          15 kB
          Yufei Gu

          People

            Assignee: Yufei Gu (yufeigu)
            Reporter: Yufei Gu (yufeigu)
            Votes: 0
            Watchers: 5