  1. Hadoop YARN
  2. YARN-5139 [Umbrella] Move YARN scheduler towards global scheduler
  3. YARN-8546

Resource leak caused by a reserved container being released more than once under async scheduling

    Details

    • Hadoop Flags:
      Reviewed

      Description

      I was able to reproduce this issue by starting a job that keeps requesting containers until it uses up the cluster's available resources. My cluster has 70200 vcores and each task requests 100 vcores, so I expected a total of 702 containers to be allocated, but in the end only 701 were. The last container could not be allocated because the queue's used resource had been updated to more than 100%.
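
      The arithmetic above can be checked with a small toy model. This is only a sketch and not YARN code: the class name, the method, and the `leakedVcores` parameter are hypothetical, and the leak size of 100 vcores is an assumed value chosen for illustration. It shows how vcores that are counted as used but held by no container make the last request fail the queue limit check.

      ```java
      // Toy model only: not the CapacityScheduler implementation.
      class QueueLeakSketch {
          /**
           * Counts how many fixed-size containers can be allocated before
           * the queue's used resource would exceed the cluster total.
           * leakedVcores is a hypothetical amount counted as used
           * although no container holds it.
           */
          static int allocatable(int clusterVcores, int vcoresPerContainer, int leakedVcores) {
              int used = leakedVcores;
              int allocated = 0;
              // Allocate while one more container still fits under 100% usage.
              while (used + vcoresPerContainer <= clusterVcores) {
                  used += vcoresPerContainer;
                  allocated++;
              }
              return allocated;
          }

          public static void main(String[] args) {
              // No leak: 70200 / 100 = 702 containers.
              System.out.println(allocatable(70200, 100, 0));   // prints 702
              // Assumed leak of 100 vcores: the 702nd request no longer fits.
              System.out.println(allocatable(70200, 100, 100)); // prints 701
          }
      }
      ```

      In this model any leak between 1 and 100 vcores gives the same result of 701 containers, which matches the symptom reported here without saying how large the real leak was.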

            People

            • Assignee:
              Tao Yang
              Reporter:
              cheersyang Weiwei Yang
            • Votes:
              0
              Watchers:
              6