Hadoop YARN / YARN-8127

Resource leak when async scheduling is enabled

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.0, 3.1.1, 2.10.2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      Brief steps to reproduce

      1. Enable async scheduling with 5 threads
      2. Submit a lot of jobs to try to exhaust cluster resources
      3. After a while, the NM's allocated resource is observed to be more than the total resource requested by its allocated containers

      It looks like the commit phase does not synchronize the handling of reserved containers, so some proposals are incorrectly accepted and resource is subsequently deducted multiple times for a single container.
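
      The suspected failure mode can be illustrated with a small toy model. This is only a sketch under stated assumptions: `Node`, `Proposal`, and both commit methods are hypothetical names invented for illustration, not YARN's actual scheduler classes, and the real fix in the attached patches may work differently. Two proposals for the same reserved container are built from the same stale view; a commit that only checks headroom accepts both, while a commit that re-validates the reservation in the same critical section accepts only the first.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the double deduction; hypothetical types, not YARN's classes. */
class ReservedCommitSketch {

    /** Hypothetical proposal to turn a reserved container into an allocated one. */
    static final class Proposal {
        final String containerId;
        final int memoryMb;

        Proposal(String containerId, int memoryMb) {
            this.containerId = containerId;
            this.memoryMb = memoryMb;
        }
    }

    /** Hypothetical node bookkeeping: a running total plus per-container records. */
    static final class Node {
        private final int capacityMb;
        private int allocatedMb = 0;
        private final Map<String, Integer> reserved = new HashMap<>();
        private final Map<String, Integer> allocated = new HashMap<>();

        Node(int capacityMb) {
            this.capacityMb = capacityMb;
        }

        synchronized void reserve(String containerId, int memoryMb) {
            reserved.put(containerId, memoryMb);
        }

        /** Unguarded commit: checks headroom only, so a stale duplicate is accepted. */
        synchronized boolean commitWithoutReservationCheck(Proposal p) {
            if (allocatedMb + p.memoryMb > capacityMb) {
                return false;
            }
            reserved.remove(p.containerId);
            allocated.put(p.containerId, p.memoryMb);
            allocatedMb += p.memoryMb; // counted again for the same container
            return true;
        }

        /** Guarded commit: re-validates the reservation before deducting. */
        synchronized boolean commitWithReservationCheck(Proposal p) {
            if (!reserved.containsKey(p.containerId)) {
                return false; // reservation already consumed by an earlier commit
            }
            if (allocatedMb + p.memoryMb > capacityMb) {
                return false;
            }
            reserved.remove(p.containerId);
            allocated.put(p.containerId, p.memoryMb);
            allocatedMb += p.memoryMb;
            return true;
        }

        synchronized int allocatedMb() {
            return allocatedMb;
        }

        /** Sum of what the allocated containers actually asked for. */
        synchronized int requestedByContainersMb() {
            int sum = 0;
            for (int mb : allocated.values()) {
                sum += mb;
            }
            return sum;
        }
    }

    /** Commits two proposals built from the same stale view of one reservation. */
    static Node run(boolean guarded) {
        Node node = new Node(8192);
        node.reserve("container_1", 2048);
        Proposal first = new Proposal("container_1", 2048);
        Proposal second = new Proposal("container_1", 2048);
        for (Proposal p : new Proposal[] {first, second}) {
            if (guarded) {
                node.commitWithReservationCheck(p);
            } else {
                node.commitWithoutReservationCheck(p);
            }
        }
        return node;
    }

    public static void main(String[] args) {
        for (boolean guarded : new boolean[] {false, true}) {
            Node n = run(guarded);
            System.out.println("guarded=" + guarded
                + " allocatedMb=" + n.allocatedMb()
                + " requestedByContainersMb=" + n.requestedByContainersMb());
        }
    }
}
```

      In the unguarded run the node's allocated total ends up larger than what its containers requested, which matches the symptom in step 3; in the guarded run the two numbers agree. The proposals are committed sequentially here to keep the sketch deterministic, on the assumption that the stale second proposal is what matters rather than the exact thread interleaving.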

      Attachments

        1. YARN-8127.branch-2.10.004.patch
          6 kB
          Eric Payne
        2. YARN-8127.004.patch
          6 kB
          Tao Yang
        3. YARN-8127.003.patch
          6 kB
          Tao Yang
        4. YARN-8127.002.patch
          7 kB
          Tao Yang
        5. YARN-8127.001.patch
          7 kB
          Tao Yang

          People

            Tao Yang
            Weiwei Yang (cheersyang)
            Votes: 0
            Watchers: 10
