Memory leak when CapacityScheduler allocates from reserved container with non-default label

Details

    Description

      The cause is that the RMContainerImpl instance created for a reserved container loses its node label expression. When the scheduler reserves a container for a request with a non-default node label, that instance is wrongly added to LeafQueue#ignorePartitionExclusivityRMContainers and is never removed.

      To reproduce this memory leak:
      (1) Create a reserved container.
      RegularContainerAllocator#doAllocation: creates RMContainerImpl instanceA (nodeLabelExpression="")
      LeafQueue#allocateResource: RMContainerImpl instanceA is put into LeafQueue#ignorePartitionExclusivityRMContainers
      (2) Allocate from the reserved container.
      RegularContainerAllocator#doAllocation: creates RMContainerImpl instanceB (nodeLabelExpression="test-label")
      (3) From then on, RMContainerImpl instanceA stays in memory (held by LeafQueue#ignorePartitionExclusivityRMContainers) until the RM is restarted.
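
The sequence above can be modeled with a small, self-contained sketch. It is not the Hadoop code: FakeRMContainer and FakeLeafQueue are hypothetical stand-ins, and the guard used in allocateResource/releaseResource (track a container only when its own label expression is empty but the node partition is not) is an assumption made for illustration. Under that assumption, instanceA is tracked on reservation and nothing ever removes it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ReservedContainerLeakSketch {
    static final String NO_LABEL = "";

    /** Hypothetical stand-in for RMContainerImpl; only the label expression matters here. */
    static final class FakeRMContainer {
        final String name;
        final String nodeLabelExpression;

        FakeRMContainer(String name, String nodeLabelExpression) {
            this.name = name;
            this.nodeLabelExpression = nodeLabelExpression;
        }
    }

    /** Hypothetical stand-in for the LeafQueue bookkeeping (assumed guard, not the real code). */
    static final class FakeLeafQueue {
        final Map<String, Set<FakeRMContainer>> ignorePartitionExclusivityRMContainers =
            new HashMap<>();

        // Assumed rule: a container with an empty label expression running on a
        // labeled partition is tracked as "ignoring partition exclusivity".
        private static boolean shouldTrack(String nodePartition, FakeRMContainer c) {
            return c.nodeLabelExpression.equals(NO_LABEL) && !nodePartition.equals(NO_LABEL);
        }

        void allocateResource(String nodePartition, FakeRMContainer c) {
            if (shouldTrack(nodePartition, c)) {
                ignorePartitionExclusivityRMContainers
                    .computeIfAbsent(nodePartition, k -> new HashSet<>())
                    .add(c);
            }
        }

        void releaseResource(String nodePartition, FakeRMContainer c) {
            if (shouldTrack(nodePartition, c)) {
                Set<FakeRMContainer> tracked =
                    ignorePartitionExclusivityRMContainers.get(nodePartition);
                if (tracked != null) {
                    tracked.remove(c);
                }
            }
        }

        int trackedCount(String nodePartition) {
            Set<FakeRMContainer> tracked =
                ignorePartitionExclusivityRMContainers.get(nodePartition);
            return tracked == null ? 0 : tracked.size();
        }
    }

    /** Replays steps (1) to (3) and returns how many containers remain tracked. */
    static int simulate() {
        FakeLeafQueue queue = new FakeLeafQueue();

        // (1) Reservation: instanceA is created without its label expression.
        FakeRMContainer instanceA = new FakeRMContainer("instanceA", NO_LABEL);
        queue.allocateResource("test-label", instanceA); // wrongly tracked

        // (2) Allocation from the reservation: a new instanceB carries the label.
        FakeRMContainer instanceB = new FakeRMContainer("instanceB", "test-label");
        queue.allocateResource("test-label", instanceB); // not tracked

        // (3) instanceB completes; only instanceB is ever released.
        queue.releaseResource("test-label", instanceB);

        return queue.trackedCount("test-label");
    }

    public static void main(String[] args) {
        System.out.println("containers still tracked: " + simulate());
    }
}
```

In this model the tracked set still holds instanceA after instanceB is released, which mirrors the leak described in step (3).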

      Attachments

        1. YARN-8774.001.patch
          6 kB
          Tao Yang
        2. YARN-8774.002.patch
          6 kB
          Tao Yang
        3. YARN-8774.branch-2.001.patch
          6 kB
          Tao Yang
        4. YARN-8774.branch-2.8.001.patch
          6 kB
          Tao Yang

          People

            Assignee: Tao Yang
            Reporter: Tao Yang
            Votes: 0
            Watchers: 8
